lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 30 Apr 2021 12:14:08 -0400
From:   Nitesh Lal <nilal@...hat.com>
To:     Thomas Gleixner <tglx@...utronix.de>
Cc:     Jesse Brandeburg <jesse.brandeburg@...el.com>,
        "frederic@...nel.org" <frederic@...nel.org>,
        "juri.lelli@...hat.com" <juri.lelli@...hat.com>,
        Marcelo Tosatti <mtosatti@...hat.com>, abelits@...vell.com,
        Robin Murphy <robin.murphy@....com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "linux-api@...r.kernel.org" <linux-api@...r.kernel.org>,
        "bhelgaas@...gle.com" <bhelgaas@...gle.com>,
        "linux-pci@...r.kernel.org" <linux-pci@...r.kernel.org>,
        "rostedt@...dmis.org" <rostedt@...dmis.org>,
        "mingo@...nel.org" <mingo@...nel.org>,
        "peterz@...radead.org" <peterz@...radead.org>,
        "davem@...emloft.net" <davem@...emloft.net>,
        "akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
        "sfr@...b.auug.org.au" <sfr@...b.auug.org.au>,
        "stephen@...workplumber.org" <stephen@...workplumber.org>,
        "rppt@...ux.vnet.ibm.com" <rppt@...ux.vnet.ibm.com>,
        "jinyuqi@...wei.com" <jinyuqi@...wei.com>,
        "zhangshaokun@...ilicon.com" <zhangshaokun@...ilicon.com>,
        netdev@...r.kernel.org, chris.friesen@...driver.com
Subject: Re: [Patch v4 1/3] lib: Restrict cpumask_local_spread to houskeeping CPUs

On Fri, Apr 30, 2021 at 3:10 AM Thomas Gleixner <tglx@...utronix.de> wrote:
>
> Nitesh,
>
> On Thu, Apr 29 2021 at 17:44, Nitesh Lal wrote:
>
> First of all: Nice analysis, well done!

Thanks, Thomas.

>
> > So to understand further what the problem was with the older kernel based
> > on Jesse's description and whether it is still there I did some more
> > digging. Following are some of the findings (kindly correct me if
> > there is a gap in my understanding):

<snip>

> >
> > I think this explains why even if we have multiple CPUs in the SMP affinity
> > mask the interrupts may only land on CPU0.
>
> There are two issues in the pre rework vector management:
>
>   1) The allocation logic itself which preferred lower numbered CPUs and
>      did not try to spread out the vectors accross CPUs. This was pretty
>      much true for any APIC addressing mode.
>
>   2) The multi CPU affinity support if supported by the APIC
>      mode. That's restricted to logical APIC addressing mode. That is
>      available for non X2APIC up to 8 CPUs and with X2APIC it requires
>      to be in cluster mode.
>
>      All other addressing modes had a single CPU target selected under
>      the hood which due to #1 was ending up on CPU0 most of the time at
>      least up to the point where it still had vectors available.
>
>      Also logical addressing mode with multiple target CPUs was subject
>      to #1 and due to the delivery logic the lowest numbered CPU (APIC)
>      was where most interrupts ended up.
>

Right, thank you for confirming.


Based on this analysis and the fact that with your re-work the interrupts
seems to be naturally spread across the CPUs, will it be safe to revert
Jesse's patch

e2e64a932 genirq: Set initial affinity in irq_set_affinity_hint()

as it overwrites the previously set IRQ affinity mask for some of the
devices?

IMHO if we think that this patch is still solving some issue other than
what Jesse has mentioned then perhaps we should reproduce that and fix it
directly from the request_irq code path.

-- 
Nitesh

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ