Date:   Wed, 09 Aug 2017 16:15:36 +1000
From:   Michael Ellerman <mpe@...erman.id.au>
To:     Sukadev Bhattiprolu <sukadev@...ux.vnet.ibm.com>
Cc:     Benjamin Herrenschmidt <benh@...nel.crashing.org>,
        tglx@...utronix.de, linux-kernel@...r.kernel.org,
        linuxppc-dev@...ts.ozlabs.org
Subject: Re: [PATCH] powerpc: xive: ensure active irqd when setting affinity

Sukadev Bhattiprolu <sukadev@...ux.vnet.ibm.com> writes:
> Michael Ellerman [mpe@...erman.id.au] wrote:
>> Sukadev Bhattiprolu <sukadev@...ux.vnet.ibm.com> writes:
>> > From fd0abf5c61b6041fdb75296e8580b86dc91d08d6 Mon Sep 17 00:00:00 2001
>> > From: Benjamin Herrenschmidt <benh@...nel.crashing.org>
>> > Date: Tue, 1 Aug 2017 20:54:41 -0500
>> > Subject: [PATCH] powerpc: xive: ensure active irqd when setting affinity
>> >
>> > Ensure irqd is active before attempting to set affinity. This should
>> > make the set affinity code more robust. For instance, this prevents
>> > these messages seen on a 4.12 based kernel when taking cpus offline:
>> >
>> >    [  123.053037264,3] XIVE[ IC 00  ] ISN 2 lead to invalid IVE !
>> >    [   77.885859] xive: Error -6 reconfiguring irq 17
>> >    [   77.885862] IRQ17: set affinity failed(-6).
>> >
>> > The underlying problem with taking cpus offline was fixed in 4.13-rc1 by:
>> >
>> >    commit 91f26cb4cd3c ("genirq/cpuhotplug: Do not migrated shutdown irqs")
>> 
>> So do we still need this? Or is the above only a partial fix?
>
> It would be good to have this fix.
>
> Commit 91f26cb4cd3c fixes the problem, so we won't see the errors with
> that commit applied. But if such a problem were to show up again, xive
> will handle them earlier before hitting those errors.

I'm not sure I'm convinced. We can't handle every possible case of the
higher level code calling us in situations we don't expect.

For example, irq_data could be NULL, but we trust the higher level code
not to do that to us.

Also I don't see any other driver doing this check.

  $ git grep irqd_is_started
  include/linux/irq.h:static inline bool irqd_is_started(struct irq_data *d)
  kernel/irq/chip.c:      if (irqd_is_started(d)) {
  kernel/irq/chip.c:      if (irqd_is_started(&desc->irq_data)) {
  kernel/irq/cpuhotplug.c:        if (irqd_is_per_cpu(d) || !irqd_is_started(d) || !irq_needs_fixup(d)) {
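
To make the trade-off concrete, here is a rough userspace sketch of a
set-affinity callback with and without the kind of guard being proposed.
It is a mock under stated assumptions, not the xive code: every mock_*
name and MOCK_* constant is hypothetical, the "started" field only stands
in for the state that irqd_is_started() reads, and the failure path simply
imitates the -6 from the quoted log.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; the real definitions live in the kernel headers. */
#define MOCK_ENXIO            6   /* mirrors the -6 seen in the quoted log */
#define MOCK_IRQ_SET_MASK_OK  0   /* stand-in for the "ok" return value */

struct mock_irq_data {
	unsigned int irq;
	bool started;   /* stand-in for what irqd_is_started() reports */
};

static bool mock_irqd_is_started(const struct mock_irq_data *d)
{
	return d->started;
}

/* Assumption: reconfiguring an irq that was shut down fails. */
static int mock_hw_configure(const struct mock_irq_data *d)
{
	return d->started ? 0 : -MOCK_ENXIO;
}

/* No guard: the driver trusts the higher level code. */
static int mock_set_affinity(struct mock_irq_data *d)
{
	return mock_hw_configure(d);
}

/* With guard: bail out early if the irq is not started. */
static int mock_set_affinity_guarded(struct mock_irq_data *d)
{
	if (!mock_irqd_is_started(d))
		return MOCK_IRQ_SET_MASK_OK;
	return mock_hw_configure(d);
}

/* Small self-check showing how the two variants differ. */
static void mock_selfcheck(void)
{
	struct mock_irq_data stopped = { .irq = 17, .started = false };

	assert(mock_set_affinity(&stopped) == -MOCK_ENXIO);
	assert(mock_set_affinity_guarded(&stopped) == MOCK_IRQ_SET_MASK_OK);
}
```

In this sketch the guarded variant hides the caller's mistake rather than
reporting it, which is the behaviour being questioned here.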


cheers
