Message-ID: <alpine.DEB.2.20.1709070731110.2433@nanos>
Date:   Thu, 7 Sep 2017 07:54:09 +0200 (CEST)
From:   Thomas Gleixner <tglx@...utronix.de>
To:     Yu Chen <yu.c.chen@...el.com>
cc:     x86@...nel.org, Ingo Molnar <mingo@...hat.com>,
        "H. Peter Anvin" <hpa@...or.com>, Rui Zhang <rui.zhang@...el.com>,
        LKML <linux-kernel@...r.kernel.org>,
        "Rafael J. Wysocki" <rjw@...ysocki.net>,
        Len Brown <lenb@...nel.org>,
        Dan Williams <dan.j.williams@...el.com>,
        Christoph Hellwig <hch@....de>,
        Peter Zijlstra <peterz@...radead.org>,
        Jeff Kirsher <jeffrey.t.kirsher@...el.com>
Subject: Re: [PATCH 4/4][RFC v2] x86/apic: Spread the vectors by choosing
 the idlest CPU

On Thu, 7 Sep 2017, Yu Chen wrote:
> On Wed, Sep 06, 2017 at 10:03:58AM +0200, Thomas Gleixner wrote:
> > Can you please apply the debug patch below, boot the machine and right
> > after login provide the output of
> > 
> > # cat /sys/kernel/debug/tracing/trace
> >
>      kworker/0:2-303   [000] ....     9.135467: msi_domain_alloc_irqs: dev: 0000:bb:00.0 nvec 1 virq 34
>      kworker/0:2-303   [000] ....     9.135476: msi_domain_alloc_irqs: dev: 0000:bb:00.0 nvec 1 virq 35
>      kworker/0:2-303   [000] ....     9.135484: msi_domain_alloc_irqs: dev: 0000:bb:00.0 nvec 1 virq 36

<SNIP>

>      kworker/0:2-303   [000] ....     9.762268: msi_domain_alloc_irqs: dev: 0000:bb:00.3 nvec 1 virq 331
>      kworker/0:2-303   [000] ....     9.762278: msi_domain_alloc_irqs: dev: 0000:bb:00.3 nvec 1 virq 332
>      kworker/0:2-303   [000] ....     9.762288: msi_domain_alloc_irqs: dev: 0000:bb:00.3 nvec 1 virq 333

That's 300 vectors.
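
The count can be cross-checked from the trace itself. The following is a minimal Python sketch, not part of the original debug session: the regex and the helper names `parse_trace` and `virq_span` are illustrative, and it assumes the virqs hidden behind the <SNIP> were handed out contiguously, so only the first and last quoted lines are needed.

```python
import re

# Matches the msi_domain_alloc_irqs lines emitted by the debug patch.
TRACE_RE = re.compile(r"msi_domain_alloc_irqs: dev: (\S+) nvec (\d+) virq (\d+)")

def parse_trace(lines):
    """Return a list of (device, nvec, virq) tuples from trace lines."""
    events = []
    for line in lines:
        m = TRACE_RE.search(line)
        if m:
            events.append((m.group(1), int(m.group(2)), int(m.group(3))))
    return events

def virq_span(events):
    """Number of virqs from the lowest to the highest seen, inclusive."""
    virqs = [virq for _, _, virq in events]
    return max(virqs) - min(virqs) + 1

# First and last trace lines quoted above.
sample = [
    "kworker/0:2-303   [000] ....     9.135467: msi_domain_alloc_irqs: dev: 0000:bb:00.0 nvec 1 virq 34",
    "kworker/0:2-303   [000] ....     9.762288: msi_domain_alloc_irqs: dev: 0000:bb:00.3 nvec 1 virq 333",
]

events = parse_trace(sample)
print(virq_span(events))  # 300
```

Feeding the full, unsnipped trace to `parse_trace` and counting distinct virqs would avoid the contiguity assumption.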

>  bb:00.[0-3] Ethernet controller: Intel Corporation Device 37d0 (rev 03)
> 
> -+-[0000:b2]-+-00.0-[b3-bc]----00.0-[b4-bc]--+-00.0-[b5-b6]----00.0
>  |           |                               +-01.0-[b7-b8]----00.0
>  |           |                               +-02.0-[b9-ba]----00.0
>  |           |                               \-03.0-[bb-bc]--+-00.0
>  |           |                                               +-00.1
>  |           |                                               +-00.2
>  |           |                                               \-00.3
> 
> They use the i40e driver, so the vectors should be reserved via:
> i40e_probe() ->
>   i40e_init_interrupt_scheme() ->
>     i40e_init_msix() ->
>       i40e_reserve_msix_vectors() ->
>         pci_enable_msix_range()
> 
> # ls /sys/kernel/debug/irq/irqs
> 0  10   11  13  142  184  217  259  292  31  33
> 337  339  340  342  344  346  348  350  352  354  356
> 358  360  362  364  366  368  370  372  374  376  378
> 380  382  384  386  388  390  392  394  4  6   7  9
> 1  109  12  14  15   2    24   26   3    32  335
> 338  34   341  343  345  347  349  351  353  355  357
> 359  361  363  365  367  369  371  373  375  377  379
> 381  383  385  387  389  391  393  395  5  67  8

Out of these 300 interrupts, exactly 8 randomly selected ones are actively
used. And the other 292 interrupts are just there because the driver might
need them in the future, when the 32-CPU machine gets magically upgraded to
4096 cores at runtime?
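
The figures 8 and 292 can be reproduced from the debugfs listing quoted above. This is a rough Python sketch under two assumptions that the mail does not spell out: that the i40e allocation is exactly the virq range 34 to 333 seen in the trace, and that an interrupt showing up under /sys/kernel/debug/irq/irqs is one that is actually in use. The constant and variable names are illustrative.

```python
# Directory names from "ls /sys/kernel/debug/irq/irqs", as quoted above.
LISTING = """
0  10   11  13  142  184  217  259  292  31  33
337  339  340  342  344  346  348  350  352  354  356
358  360  362  364  366  368  370  372  374  376  378
380  382  384  386  388  390  392  394  4  6   7  9
1  109  12  14  15   2    24   26   3    32  335
338  34   341  343  345  347  349  351  353  355  357
359  361  363  365  367  369  371  373  375  377  379
381  383  385  387  389  391  393  395  5  67  8
"""

# Assumed bounds of the i40e allocation, taken from the trace output.
FIRST_VIRQ = 34
LAST_VIRQ = 333

irqs = [int(tok) for tok in LISTING.split()]
allocated = LAST_VIRQ - FIRST_VIRQ + 1
active = sorted(irq for irq in irqs if FIRST_VIRQ <= irq <= LAST_VIRQ)

print(active)                   # [34, 67, 109, 142, 184, 217, 259, 292]
print(len(active))              # 8
print(allocated - len(active))  # 292
```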

Can the i40e people @intel please fix this waste of resources and sanitize
their interrupt allocation scheme?

Please switch it over to managed interrupts so the affinity spreading
happens in a sane way and the interrupts are properly managed on CPU
hotplug.

Thanks,

	tglx
