linux-kernel - Re: [RFC/PATCHv2] kernel/irq: allow more precise irq affinity policies

Message-ID: <alpine.LFD.2.00.1009232015180.2416@localhost6.localdomain6>
Date:	Thu, 23 Sep 2010 20:36:35 +0200 (CEST)
From:	Thomas Gleixner <tglx@...utronix.de>
To:	Arthur Kepner <akepner@....com>
cc:	linux-kernel@...r.kernel.org,
	Ben Hutchings <bhutchings@...arflare.com>
Subject: Re: [RFC/PATCHv2] kernel/irq: allow more precise irq affinity
 policies

On Thu, 23 Sep 2010, Thomas Gleixner wrote:

> On Wed, 22 Sep 2010, Arthur Kepner wrote:
> 
> > diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> > index cea0cd9..8fa7f52 100644
> > --- a/arch/x86/Kconfig
> > +++ b/arch/x86/Kconfig
> > @@ -313,6 +313,17 @@ config NUMA_IRQ_DESC
> >  	def_bool y
> >  	depends on SPARSE_IRQ && NUMA
> >  
> > +config IRQ_POLICY_NUMA
> > +	bool "Assign default interrupt affinities in a NUMA-friendly way"
> > +	def_bool y
> > +	depends on SPARSE_IRQ && NUMA
> > +	---help---
> > +	   When a device requests an interrupt, the default CPU used to
> > +	   service the interrupt will be selected from a node 'near by'
> > +	   the device. Also, interrupt affinities will be spread around
> > +	   the node so as to prevent any single CPU from running out of
> > +	   interrupt vectors.
> > +

I thought more about this and came to the conclusion that this
facility is completely overengineered and mostly useless except for a
little detail.

The only problem it solves is running out of vectors on the
low-numbered cpus when a NIC which insists on creating one irq per
cpu starts up.

Fine, I can see that this is a problem, but we do not need this
complete nightmare to solve it. We can do that in a much simpler way.

 1) There is a patch from your coworker to work around that in the low
    level x86 code, which probably works, but is suboptimal and not
    generic.

 2) We already know that the NIC requested the irq on node N. So when
    we set it up, we just honour the wish of the driver as long as it
    fits in the default (or modified) affinity mask and restrict the
    affinity to the cpus on that very node.

    That makes a whole lot of sense: The driver already knows on which
    cpus it wants to see the irq, because it allocated queues and
    stuff there.

    So that's probably a patch of 10 lines or less to fix that.
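
The mask selection in 2) can be illustrated with plain bitmasks. This
is only a user-space sketch in Python, not kernel code: pick_affinity,
node_mask and default_mask are hypothetical names, and falling back to
the default mask when no cpu of the node is allowed is an assumption,
not something stated above.

```python
def pick_affinity(node_mask: int, default_mask: int) -> int:
    """Return the cpu bitmask to use for an irq requested on one node.

    node_mask:    bit i is set if cpu i belongs to the requesting node
    default_mask: bit i is set if cpu i is in the default (or modified)
                  irq affinity mask
    """
    restricted = node_mask & default_mask
    # Assumption: if no cpu of the node is allowed, keep the default mask.
    return restricted if restricted else default_mask


# Example: the node holds cpus 4-7, the default affinity allows cpus 0-5,
# so only cpus 4 and 5 remain.
print(hex(pick_affinity(0xF0, 0x3F)))  # prints 0x30
```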

So now to the whole other policy horror. That belongs in user space
and can be done in user space today. We do _NOT_ implement policies in
the kernel.

User space knows exactly how many irqs are affine to which cpu, knows
the topology and can do the balancing on its own.

So please go wild and put your nr_irqs * nr_irqs loop into some user
space program.
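
As a rough illustration of what such a user space program could do,
here is a minimal Python sketch. It assumes the tool has already
collected, for each irq, the set of cpus it may run on (for example
the cpus of the device's node); balance() and smp_affinity_mask() are
hypothetical helper names, and the greedy "least loaded cpu first"
rule is just one possible policy.

```python
from collections import Counter


def balance(irq_allowed, load=None):
    """Assign each irq to the least loaded cpu it is allowed to run on.

    irq_allowed: dict mapping irq number -> iterable of allowed cpu ids
    load:        optional dict cpu id -> number of irqs already bound there
    Returns a dict irq number -> chosen cpu id.
    """
    load = Counter(load or {})
    assignment = {}
    for irq in sorted(irq_allowed):
        cpus = sorted(irq_allowed[irq])
        if not cpus:
            continue  # nothing sensible to do for an irq with no allowed cpu
        # min() keeps the first minimum, so ties go to the lowest cpu id.
        cpu = min(cpus, key=lambda c: load[c])
        assignment[irq] = cpu
        load[cpu] += 1
    return assignment


def smp_affinity_mask(cpu):
    """Format a single-cpu mask as hex in comma separated 32-bit groups,
    most significant group first, as used by /proc/irq/<N>/smp_affinity."""
    mask = 1 << cpu
    groups = []
    while True:
        groups.append("%08x" % (mask & 0xFFFFFFFF))
        mask >>= 32
        if not mask:
            break
    return ",".join(reversed(groups))
```

Applying the result means writing smp_affinity_mask(cpu) to
/proc/irq/<irq>/smp_affinity for each assignment, which needs root and
is left out of the sketch.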

Thanks,

	tglx