Message-ID: <AE04B507-C5E2-44D2-9190-41E9BE720F9D@amazon.com>
Date:   Tue, 2 Jun 2020 18:47:54 +0000
From:   "Saidi, Ali" <alisaidi@...zon.com>
To:     "Herrenschmidt, Benjamin" <benh@...zon.com>,
        "maz@...nel.org" <maz@...nel.org>
CC:     "tglx@...utronix.de" <tglx@...utronix.de>,
        "jason@...edaemon.net" <jason@...edaemon.net>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "linux-arm-kernel@...ts.infradead.org" 
        <linux-arm-kernel@...ts.infradead.org>,
        "Woodhouse, David" <dwmw@...zon.co.uk>,
        "Zilberman, Zeev" <zeev@...zon.com>,
        "Machulsky, Zorik" <zorik@...zon.com>
Subject: Re: [PATCH] irqchip/gic-v3-its: Don't try to move a disabled irq


On 5/31/20, 9:40 PM, "Herrenschmidt, Benjamin" <benh@...zon.com> wrote:

    On Sun, 2020-05-31 at 12:09 +0100, Marc Zyngier wrote:
    > 
    > 
    > > Not great indeed. But this is not, as far as I can tell, a GIC
    > > driver problem.
    > > 
    > > The semantic of activate/deactivate (which maps to started/shutdown
    > > in the IRQ code) is that the HW resources for a given interrupt are
    > > only committed when the interrupt is activated. Trying to perform
    > > actions involving the HW on an interrupt that isn't active cannot be
    > > guaranteed to take effect.
    > > 
    > > I'd rather address it in the core code, by preventing set_affinity (and
    > > potentially others) to take place when the interrupt is not in the
    > > STARTED state. Userspace would get an error, which is perfectly
    > > legitimate, and which it already has to deal with for plenty of
    > > other reasons.
    
    So I finally found time to dig a bit in there :) Code has changed a bit
    since last I looked. But I have memories of the startup code messing
    around with the affinity, and here it is. In irq_startup() :
    
    
    		switch (__irq_startup_managed(desc, aff, force)) {
    		case IRQ_STARTUP_NORMAL:
    			ret = __irq_startup(desc);
    			irq_setup_affinity(desc);
    			break;
    		case IRQ_STARTUP_MANAGED:
    			irq_do_set_affinity(d, aff, false);
    			ret = __irq_startup(desc);
    			break;
    		case IRQ_STARTUP_ABORT:
    			irqd_set_managed_shutdown(d);
    			return 0;
    
    So we have two cases here. Normal and managed.
    
    In the managed case, we set the affinity before startup. I feel like your
    patch might break that or am I missing something ?
    
    Additionally, your patch would break any userspace program that expects to
    be able to change the affinity on an interrupt before it's been started.
    I don't know if such a thing exists but the fact that we hit that bug
    makes me think it might.
    
    Now most controller drivers (at least that I'm familiar with, which doesn't
    include GIC at this point) can deal with that just fine.
    
    Now there's also another possible issue:
    
    Your patch checks irqd_is_started(). Now I always mix up irqd vs irq_state these
    days so I may be wrong but irq_state_set_started() is only done in __irq_startup
    which will *not* be called if the interrupt has NOAUTOEN.
    
    Is that ok ? Do we intend for affinity setting not to work until the first
    enable_irq() for such an interrupt ? We could check activated instead of
    started I suppose. (again provided I didn't mix up two different things
    between the irqd and the irq_state stuff).
    
    For these reasons my gut feeling is we should just fix GIC as Ali wanted to
    do initially.
    
    The basic idea is simply to defer the HW configuration until the interrupt
    has been started. I don't see why that would be an issue. Have set_affinity just
    store the mask (and apply whatever other sanity checking it might want to do)
    until the interrupt is started and when started, apply things to HW.
    
    I might be missing a reason why it's more complicated than that :) But I do
    feel a bit uncomfortable with your approach.
    
It looks like the x86 APIC set_affinity callback explicitly checks whether the interrupt is activated in the managed case, which makes sense given the code Ben posted above:
          /*
           * Core code can call here for inactive interrupts. For inactive
           * interrupts which use managed or reservation mode there is no
           * point in going through the vector assignment right now as the
           * activation will assign a vector which fits the destination
           * cpumask. Let the core code store the destination mask and be
           * done with it.
           */
          if (!irqd_is_activated(irqd) &&
              (apicd->is_managed || apicd->can_reserve))    

My original patch should certainly check activated rather than disabled. With that change, do you still have reservations, Marc?

Thanks,
Ali



