Message-ID: <alpine.DEB.2.20.1711071551330.1716@nanos>
Date:   Tue, 7 Nov 2017 16:07:11 +0100 (CET)
From:   Thomas Gleixner <tglx@...utronix.de>
To:     Sagi Grimberg <sagi@...mberg.me>
cc:     Jes Sorensen <jsorensen@...com>,
        Tariq Toukan <tariqt@...lanox.com>,
        Saeed Mahameed <saeedm@....mellanox.co.il>,
        Networking <netdev@...r.kernel.org>,
        Leon Romanovsky <leonro@...lanox.com>,
        Saeed Mahameed <saeedm@...lanox.com>,
        Kernel Team <kernel-team@...com>,
        Christoph Hellwig <hch@....de>
Subject: Re: mlx5 broken affinity

On Sun, 5 Nov 2017, Sagi Grimberg wrote:
> > > > This wasn't to start a debate about which allocation method is the
> > > > perfect solution. I am perfectly happy with the new default, the part
> > > > that is broken is to take away the user's option to reassign the
> > > > affinity. That is a bug and it needs to be fixed!
> > > 
> > > Well,
> > > 
> > > I would really want to wait for Thomas/Christoph to reply, but this
> > > simple change fixed it for me:
> > > --
> > > diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
> > > index 573dc52b0806..eccd06be5e44 100644
> > > --- a/kernel/irq/manage.c
> > > +++ b/kernel/irq/manage.c
> > > @@ -146,8 +146,7 @@ bool irq_can_set_affinity_usr(unsigned int irq)
> > >   {
> > >          struct irq_desc *desc = irq_to_desc(irq);
> > > 
> > > -       return __irq_can_set_affinity(desc) &&
> > > -               !irqd_affinity_is_managed(&desc->irq_data);
> > > +       return __irq_can_set_affinity(desc);
> > 
> > Which defeats the whole purpose of the managed facility, which is _not_ to
> > break the affinities on cpu offline and bring the interrupt back on the CPU
> > when it comes online again.
> > 
> > What I can do is to have a separate flag, which only uses the initial
> > distribution mechanism, but I really want to have Christoph's opinion on
> > that.
> 
> I do agree that the user would lose better cpu online/offline behavior,
> but it seems that users want to still have some control over the IRQ
> affinity assignments even if they lose this functionality.

Depending on the machine and the number of queues this might even result in
completely losing the ability to suspend/hibernate because the number of
available vectors on CPU0 is not sufficient to accommodate all queue
interrupts.
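
To illustrate the vector exhaustion argument, here is a minimal userspace
sketch. It is not kernel code: struct machine, its fields and both helpers
are hypothetical names, and the numbers used with it are made up. It assumes
that non-managed interrupts get migrated to CPU0 when all other CPUs go
offline for suspend, while managed interrupts are shut down together with
their CPU and therefore need no extra vector on CPU0.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a machine right before suspend. Not a kernel API. */
struct machine {
	unsigned int vectors_free_cpu0;    /* assumed free vectors on CPU0 */
	unsigned int queue_irqs_elsewhere; /* queue irqs targeting other CPUs */
	bool managed;                      /* are the queue irqs managed? */
};

/*
 * Number of queue interrupts which need a new vector on CPU0 once all
 * other CPUs are offline.
 */
static unsigned int irqs_moved_to_cpu0(const struct machine *m)
{
	assert(m);
	/* Managed: shut down with their CPU, nothing migrates to CPU0. */
	return m->managed ? 0 : m->queue_irqs_elsewhere;
}

/* Suspend works only if CPU0 can take every migrated interrupt. */
static bool can_suspend(const struct machine *m)
{
	return irqs_moved_to_cpu0(m) <= m->vectors_free_cpu0;
}
```

With, say, 255 non-managed queue interrupts on other CPUs and 150 free
vectors on CPU0, can_suspend() returns false. The same machine with managed
interrupts returns true.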

> Would it be possible to keep the managed facility until a user overrides
> an affinity assignment? This way if the user didn't touch it, we keep
> all the perks, and in case the user overrides it, we log the implication
> so the user is aware?

A lot of things are possible; the question is whether it makes sense. The
whole point is to have resources (queues, interrupts etc.) per CPU and have
them strictly associated.

Why would you give the user a knob to destroy what you carefully optimized?

Just because we can and just because users love those knobs, or is there any
real technical reason?

Thanks,

	tglx
