Date:	Thu, 15 Dec 2011 21:26:46 -0800
From:	ebiederm@...ssion.com (Eric W. Biederman)
To:	Will Deacon <will.deacon@....com>
Cc:	"tglx\@linutronix.de" <tglx@...utronix.de>,
	"linux-kernel\@vger.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: IRQ migration on CPU offline path

Will Deacon <will.deacon@....com> writes:

> Hi Eric,
>
> Cheers for the response.
>
> On Thu, Dec 15, 2011 at 01:25:19AM +0000, Eric W. Biederman wrote:
>> Will Deacon <will.deacon@....com> writes:
>> > I've been looking at the IRQ migration code on x86 (fixup_irqs) for the CPU
>> > hotplug path in order to try and fix a bug we have on ARM with the
>> > desc->affinity mask not getting updated. Compared to irq_set_affinity, the code
>> > is pretty whacky (I guess because it's called via stop_machine) so I wondered if
>> > you could help me understand a few points:
>> 
>> There is a lot of craziness on that path: because of poor hardware
>> design on x86 we can't know when an irq has actually been migrated,
>> and there are other nasties.
>> 
>> There is also the issue, which I expect is still the case, that the
>> generic layer asks us to do the cpu migration and the associated irq
>> migrations with irqs disabled, which at least for the poorly designed
>> bits of hardware makes the entire path a best-effort beast.
>
> Argh, ok. Does this mean that other architectures should just preserve the
> interface that x86 gives (for example not triggering IRQ affinity
> notifiers)?

Interesting.  In this case the affinity notifier is an ugly hack for
exactly one driver.  The affinity notifier is new (this January) and
buggy.  Among other things there appears to be a clear reference-count
leak on the affinity notify structure.

Honestly I don't see much to justify the existence of the affinity
notifiers, and especially their requirement that they be called in
process context.
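
For reference, this is roughly what the driver-side hookup looks like
with the irq_set_affinity_notifier() API; the my_channel wrapper and
the function names below are made up for illustration, so treat this
as a sketch rather than the actual sfc code:

#include <linux/interrupt.h>
#include <linux/kref.h>

/* Hypothetical driver state embedding the notifier block. */
struct my_channel {
	struct irq_affinity_notify affinity_notify;
	unsigned int irq;
};

/* Runs from a workqueue (process context) after the affinity changes. */
static void my_affinity_notify(struct irq_affinity_notify *notify,
			       const cpumask_t *mask)
{
	struct my_channel *chan =
		container_of(notify, struct my_channel, affinity_notify);

	/* e.g. re-steer receive flows towards the CPUs in *mask */
	(void)chan;
}

/* kref release; the structure must stay valid until this has run. */
static void my_affinity_release(struct kref *ref)
{
	/* nothing dynamically allocated in this sketch */
}

static int my_register_notifier(struct my_channel *chan)
{
	chan->affinity_notify.notify = my_affinity_notify;
	chan->affinity_notify.release = my_affinity_release;

	/* passing NULL instead of the notifier unregisters it again */
	return irq_set_affinity_notifier(chan->irq, &chan->affinity_notify);
}

The process-context requirement is what forces the work_struct and kref
indirection in there, and the kref on that structure is where the leak
mentioned above comes in.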

At a practical level, since the architects of the affinity notifier
didn't choose to add notification on migration, I don't see why
you should care.

This isn't an x86-versus-the-rest-of-the-world issue.  This is a
Solarflare-driver-versus-the-rest-of-the-kernel issue.  When the
Solarflare developers care they can fix up arm and all of the rest of
the architectures that support cpu hot-unplug.

>> If x86 becomes a good clean example in this corner case I would be
>> amazed.  Last I looked I almost marked it all as CONFIG_BROKEN because
>> we were trying to do the impossible.  Unfortunately people's laptops
>> go through this path when they suspend, and so it was more painful to
>> disable the hacky racy mess than to keep living with it.
>> 
>> There has been an increase in the number of cases where it is possible
>> to actually perform the migration with irqs disabled, so on a good day
>> that code might even work.
>
> Right, so this stuff is fairly fragile. We can probably get a reasonable
> version working on ARM (with the GIC present) but I'm not sure what to do
> about the notifiers I mentioned earlier and proper migration of threaded
> interrupt handlers.

Yes. It looks like the irq migration notifiers are just broken by design.

As for threaded interrupt handlers, there is probably something
reasonable that can be done there.  My guess is that threaded interrupt
handlers should be handled the same way any other thread is handled
during cpu hot-unplug.  And if something needs to be done I expect the
generic code can do it.
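
To make the threaded-handler point concrete: a driver sets one up with
request_threaded_irq(), where the hard-irq half runs in interrupt
context and the thread_fn half runs in an ordinary kernel thread
(named irq/<nr>-<name>), which is exactly why the generic hot-unplug
code should be able to move it off a dying cpu like any other task.
The my_dev names below are made up:

#include <linux/interrupt.h>

/* Hypothetical device context. */
struct my_dev {
	int irq;
};

/* Hard-irq half: ack/mask the device, then defer to the thread. */
static irqreturn_t my_hardirq(int irq, void *dev_id)
{
	return IRQ_WAKE_THREAD;
}

/* Threaded half: runs in a schedulable kernel thread, so the scheduler
 * (and the hotplug code) can migrate it like any other task. */
static irqreturn_t my_thread_fn(int irq, void *dev_id)
{
	struct my_dev *dev = dev_id;

	/* ... do the work that may sleep ... */
	(void)dev;
	return IRQ_HANDLED;
}

static int my_request_irq(struct my_dev *dev)
{
	return request_threaded_irq(dev->irq, my_hardirq, my_thread_fn,
				    0, "my_dev", dev);
}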

Eric
