Message-ID: <m13abr2w0a.fsf@fess.ebiederm.org>
Date:	Wed, 29 Apr 2009 10:46:29 -0700
From:	ebiederm@...ssion.com (Eric W. Biederman)
To:	Gary Hade <garyhade@...ibm.com>
Cc:	Yinghai Lu <yhlu.kernel@...il.com>, mingo@...e.hu,
	mingo@...hat.com, tglx@...utronix.de, hpa@...or.com,
	x86@...nel.org, linux-kernel@...r.kernel.org, lcm@...ibm.com
Subject: Re: [PATCH 3/3] [BUGFIX] x86/x86_64: fix IRQ migration triggered active device IRQ interrruption

Gary Hade <garyhade@...ibm.com> writes:

>> > This didn't help.  Using 2.6.30-rc3 plus your patch both bugs
>> > are unfortunately still present.
>> 
>> You could offline the cpus?  I know when I tested it on my
>> laptop I could not offline the cpus.
>
> Eric, I'm sorry!  This was due to my stupid mistake.  When I
> went to apply your patch I included --dry-run to test it but
> apparently got distracted and never actually ran patch(1)
> without --dry-run. <SIGH>
>
> So, I just rebuilt after _really_ applying the patch and got
> the following result, which is probably what you intended.

Ok.  Good to see.

>> >> I propose detecting the cases that we know are safe to migrate in
>> >> process context, aka logical delivery with less than 8 cpus aka "flat"
>> >> routing mode and modifying the code so that those work in process
>> >> context and simply deny cpu hotplug in all of the rest of the cases.
>> >
>> > Humm, are you suggesting that CPU offlining/onlining would not
>> > be possible at all on systems with >8 logical CPUs (i.e. most
>> > of our systems) or would this just force users to separately
>> > migrate IRQ affinities away from a CPU (e.g. by shutting down
>> > the irqbalance daemon and writing to /proc/irq/<irq>/smp_affinity)
>> > before attempting to offline it?
>> 
>> A separate migration, for those hard to handle irqs.
>> 
>> The newest systems have iommus that irqs go through or are using MSIs
>> for the important irqs, and as such can be migrated in process
>> context.  So this is not a restriction for future systems.
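
The "separate migration" discussed above can be sketched from user space.
This is only an illustration, assuming the usual /proc/irq/<irq>/smp_affinity
format of a hex CPU bitmask in comma-separated 32-bit groups. The helper
names and the proc_irq parameter are made up for the example. The write
only requests a new affinity; the kernel may reject it for some irqs, and
the move may not complete until the irq next fires.

```python
import os

def parse_mask(text):
    """Parse an smp_affinity-style hex mask such as '00000000,0000000f'."""
    return int(text.strip().replace(",", ""), 16)

def format_mask(mask):
    """Format a CPU bitmask as comma-separated 32-bit hex groups."""
    groups = []
    while True:
        groups.append("%08x" % (mask & 0xffffffff))
        mask >>= 32
        if mask == 0:
            break
    return ",".join(reversed(groups))

def move_irq_off_cpu(irq, cpu, proc_irq="/proc/irq"):
    """Request that `irq` no longer target `cpu` (hypothetical helper).

    Returns True if a new mask was written, False if the irq's mask
    did not include `cpu` in the first place.
    """
    path = os.path.join(proc_irq, str(irq), "smp_affinity")
    with open(path) as f:
        mask = parse_mask(f.read())
    if not mask & (1 << cpu):
        return False
    new_mask = mask & ~(1 << cpu)
    if new_mask == 0:
        # Refuse to write an empty mask; the irq has nowhere else to go.
        raise ValueError("irq %s would be left with no CPU" % irq)
    with open(path, "w") as f:
        f.write(format_mask(new_mask))
    return True
```

Writing to the real /proc/irq needs root, and irqbalance has to be stopped
first or it may move the irqs back.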
>
> I understand your concerns but we need a solution for the
> earlier systems that does NOT remove or cripple the existing
> CPU hotplug functionality.  If you can come up with a way to
> retain CPU hotplug function while doing all IRQ migration in
> interrupt context I would certainly be willing to try to find
> some time to help test and debug your changes on our systems.

Well that is ultimately what I am looking towards.

How do we move to a system that works by design, instead of
one with design goals that are completely conflicting?

Thinking about it, we should be able to preemptively migrate
irqs in the hook I am using that denies cpu hotplug.

If they don't migrate after a short while I expect we should
still fail but that would relieve some of the pain, and certainly
prevent a non-working system.
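
As a rough model of that idea (not kernel code): ask every irq still
targeting the cpu to move, wait a bounded time, and deny the unplug if any
are left. The two callbacks and the timeout values are hypothetical
placeholders for whatever the real hook would use.

```python
import time

def prepare_cpu_down(cpu, irqs_on_cpu, request_migration,
                     timeout=1.0, poll=0.01,
                     clock=time.monotonic, sleep=time.sleep):
    """Return True if `cpu` may go offline, False to deny the request.

    irqs_on_cpu(cpu)            -> iterable of irqs still targeting cpu
    request_migration(irq, cpu) -> ask for irq to be moved elsewhere
    Both callbacks are assumptions made for this sketch.
    """
    # Preemptively ask every irq on this cpu to migrate away.
    for irq in list(irqs_on_cpu(cpu)):
        request_migration(irq, cpu)

    # Give the migrations a short while to complete.
    deadline = clock() + timeout
    while list(irqs_on_cpu(cpu)):
        if clock() >= deadline:
            return False  # some irq never moved: fail the unplug
        sleep(poll)
    return True
```

An irq that nobody is using could be treated as already migrated inside
irqs_on_cpu, which is one of the tweaks mentioned below.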

There are little bits we can tweak like special casing irqs that
no-one is using.

My preference here is that I would rather deny cpu hot-unplug than
have the non-working system problems that you have seen.

All of that said I have some questions about your hardware.
- How many sockets and how many cores do you have?
- How many irqs do you have?
- Do you have an iommu that irqs can go through?

If you have <= 8 cores this problem is totally solvable.
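
The limit of 8 comes from flat logical delivery using an 8-bit destination
bitmap, one bit per cpu. A small sketch of that constraint, assuming cpu n
is assigned logical bit n (the function name is invented for illustration):

```python
FLAT_MODE_MAX_CPUS = 8  # the flat logical destination is an 8-bit bitmap

def flat_logical_dest(cpus):
    """Build the 8-bit logical destination mask for a set of cpus.

    Assumes cpu n owns bit n. Raises ValueError for any cpu that
    cannot be represented in flat mode.
    """
    dest = 0
    for cpu in cpus:
        if not 0 <= cpu < FLAT_MODE_MAX_CPUS:
            raise ValueError("cpu %d does not fit in flat mode" % cpu)
        dest |= 1 << cpu
    return dest
```

With more than 8 cpus there is no bit left for the extra cpus, which is
why those systems need a different routing mode.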

Other cases may be but I don't know what the tradeoffs are.
For very large systems we don't have enough irqs without
running in physical flat mode, which makes things
even more of a challenge.

It may also be that your ioapics don't have the bugs that
intel and amd ioapics have and we could have a way to recognize
high quality ioapics.

Eric