linux-kernel - Re: [PATCH] ALSA: hda: Request driver probe from an async task

Message-ID: <s5hlgdertjg.wl-tiwai@suse.de>
Date:   Mon, 23 Apr 2018 14:33:55 +0200
From:   Takashi Iwai <tiwai@...e.de>
To:     Paul Menzel <pmenzel+alsa-devel@...gen.mpg.de>
Cc:     Jaroslav Kysela <perex@...ex.cz>, alsa-devel@...a-project.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] ALSA: hda: Request driver probe from an async task

On Mon, 23 Apr 2018 14:30:36 +0200,
Paul Menzel wrote:
> 
> Dear Takashi,
> 
> 
> On 04/23/18 14:21, Takashi Iwai wrote:
> > On Mon, 23 Apr 2018 14:05:52 +0200,
> > Paul Menzel wrote:
> >>
> >> From: Paul Menzel <pmenzel@...gen.mpg.de>
> >> Date: Sat, 24 Mar 2018 09:28:43 +0100
> >>
> >> On an ASRock E350M1 with Linux 4.17-rc1, according to `initcall_debug`,
> >> calling `azx_driver_init` sometimes takes more than a few milliseconds,
> >> and up to 200 ms.
> >>
> >> ```
> >> [    2.892598] calling  azx_driver_init+0x0/0xfe4 [snd_hda_intel] @ 218
> >> [    2.943002] initcall azx_driver_init+0x0/0xfe4 [snd_hda_intel] returned 0 after 49195 usecs
> >> ```
> >>
> >> When trying to boot the Linux kernel in less than 500 ms, this is quite a
> >> hold-up. Therefore, request the probe from an async task.
> >>
> >> With this change, the test shows that the function returns earlier.
> >>
> >> ```
> >> [    3.254800] calling  azx_driver_init+0x0/0xfe4 [snd_hda_intel] @ 227
> >> [    3.254887] initcall azx_driver_init+0x0/0xfe4 [snd_hda_intel] returned 0 after 66 usecs
> >> ```
> >>
> >> The same behavior is visible on a Dell OptiPlex 7010. The longer times
> >> seem to occur when the module *e1000e* is probed at the same time.
> >>
> >> Signed-off-by: Paul Menzel <pmenzel@...gen.mpg.de>
> >
> > What actually took so long?  Could you analyze further instead of
> > blindly setting the flag?
> 
> Well, I am not sure. Could you please give me hints on how to debug this
> further? Is there some debug flag?

Usually perf would help, but even a simple printk() should suffice to
see what's going on there :)
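
Before instrumenting the driver itself, it can help to see which initcalls
are slow across several boots. A minimal sketch in Python, assuming the
`initcall_debug` record format quoted in the patch description, with each
`initcall ... returned` record on a single line as dmesg prints it. The
function name `slow_initcalls` and the 10 ms default threshold are
hypothetical choices for illustration, not part of any existing tool:

```python
import re

# Matches records such as:
# [    2.943002] initcall azx_driver_init+0x0/0xfe4 [snd_hda_intel] returned 0 after 49195 usecs
INITCALL_RE = re.compile(
    r"initcall\s+(?P<func>[\w.]+)\+\S+"    # function name, then +offset/size
    r"(?:\s+\[(?P<module>[^\]]+)\])?"      # optional [module] for loadable modules
    r"\s+returned\s+-?\d+\s+after\s+(?P<usecs>\d+)\s+usecs"
)


def slow_initcalls(log_text, threshold_usecs=10_000):
    """Return (usecs, function, module) tuples at or above the threshold, slowest first."""
    hits = []
    for line in log_text.splitlines():
        m = INITCALL_RE.search(line)
        if m and int(m["usecs"]) >= threshold_usecs:
            hits.append((int(m["usecs"]), m["func"], m["module"] or "built-in"))
    return sorted(hits, reverse=True)


if __name__ == "__main__":
    # Sample records taken from the log excerpts quoted in this thread.
    sample = """\
[    2.892598] calling  azx_driver_init+0x0/0xfe4 [snd_hda_intel] @ 218
[    2.943002] initcall azx_driver_init+0x0/0xfe4 [snd_hda_intel] returned 0 after 49195 usecs
[    3.254887] initcall azx_driver_init+0x0/0xfe4 [snd_hda_intel] returned 0 after 66 usecs
"""
    for usecs, func, module in slow_initcalls(sample):
        print(f"{usecs / 1000:.1f} ms  {func} [{module}]")
```

This only ranks whole initcalls by duration. Finding what stalls inside the
probe still needs perf or printk() timestamps in the driver.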

> I am only aware of the Ftrace framework, but in my experience it also
> skews the timings quite a bit, so might not be the best choice.

We know that there are some cases where the codec / controller
communication stalls on recent platforms such as Coffee Lake.
But we are not quite sure how it happens.

Moving the stuff into async just moves something ugly out of sight, and
it's no fix, per se, if such a long delay is itself unexpected.


thanks,

Takashi