linux-kernel - Re: Reworking suspend-resume sequence (was: Re: PCI PM: Restore standard config registers of all devices early)

Message-ID: <alpine.LFD.2.00.0902031209280.3247@localhost.localdomain>
Date:	Tue, 3 Feb 2009 12:18:36 -0800 (PST)
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Ingo Molnar <mingo@...e.hu>
cc:	Thomas Gleixner <tglx@...utronix.de>,
	Jesse Barnes <jesse.barnes@...el.com>,
	"Rafael J. Wysocki" <rjw@...k.pl>,
	Benjamin Herrenschmidt <benh@...nel.crashing.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Andreas Schwab <schwab@...e.de>, Len Brown <lenb@...nel.org>
Subject: Re: Reworking suspend-resume sequence (was: Re: PCI PM: Restore
 standard config registers of all devices early)

On Tue, 3 Feb 2009, Ingo Molnar wrote:
> 
>  - the screaming-irq observation i had - do you consider that valid?:
> 
>    >> [ In theory this also solves screaming level-triggered irqs that 
>    >>   advertise themselves as edge-triggered [due to firmware/BIOS bug - 
>    >>   these do happen] and then keep spamming the system. ]
> 
>    I wanted to have a pretty much interchangeable flow method between edge 
>    and level triggered - so that the BIOS cannot screw us by enumerating an 
>    irq as edge-triggered while it's level-triggered.

Yes, if we can't be 100% sure it's really edge-triggered, I guess the mask 
thing is really worth it. So maybe "handle_edge_irq()" is actually doing 
everything right.

Of course, with MSI, we can fundamentally really be sure that it's 
edge-triggered (since it's literally a packet on the PCI bus that 
generates it), and that actually brings up another possibility: assuming 
handle_edge_irq() is doing the correct "safe" thing, maybe the answer is 
to just get rid of the MSI "mask()" operation as being unnecessary, and 
catch it at that level.

NOTE! From a correctness standpoint I think this is all irrelevant. Even 
if we have turned off the power of some device, the msi irq masking isn't 
going to hurt (apart from _possibly_ causing a machine check, but that's 
nothing new - architectures that enable machine checks on accesses to 
non-responding PCI hardware have to handle those anyway).

So I wouldn't worry too much. I think this is interesting mostly from a 
performance standpoint - MSI interrupts are supposed to be fast, and under 
heavy interrupt load I could easily see something like

 - cpu1: handles interrupt, has acked it, calls down to the handler

 - the handler clears the original irq source, but another packet (or disk 
   completion) happens almost immediately

 - cpu2 takes the second interrupt, but it's still IRQ_INPROGRESS, so it 
   masks.

 - cpu1 gets back from the handler, sees IRQ_PENDING, unmasks, and now 
   really handles the second interrupt.
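
To make the sequence above concrete, here is a minimal user-space sketch of 
it. This is not the kernel's handle_edge_irq(); struct irq_model and the 
helper names are hypothetical, and the model only counts how many 
mask/unmask operations the two-CPU race would send to the device.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified per-IRQ state; not the kernel's struct irq_desc. */
struct irq_model {
	bool in_progress;	/* stands in for IRQ_INPROGRESS */
	bool pending;		/* stands in for IRQ_PENDING */
	bool masked;
	unsigned int mask_ops;	/* mask operations sent to the device */
	unsigned int unmask_ops;	/* unmask operations sent to the device */
	unsigned int handled;	/* completed handler passes */
};

/* An interrupt arrives on some CPU; returns true if that CPU runs the handler. */
static bool irq_arrive(struct irq_model *d)
{
	if (d->in_progress) {
		/* Another CPU is already in the handler: mark pending and mask. */
		d->pending = true;
		d->masked = true;
		d->mask_ops++;
		return false;
	}
	d->in_progress = true;
	return true;
}

/* The handling CPU finished one pass; returns true if it must run another. */
static bool irq_handler_done(struct irq_model *d)
{
	d->handled++;
	if (d->pending) {
		d->pending = false;
		if (d->masked) {
			d->masked = false;
			d->unmask_ops++;
		}
		return true;
	}
	d->in_progress = false;
	return false;
}

/* Replays the scenario: cpu1 takes the irq, cpu2 takes a second one mid-handler. */
static struct irq_model simulate_back_to_back(void)
{
	struct irq_model d = {0};
	bool cpu1_runs = irq_arrive(&d);
	bool cpu2_runs = irq_arrive(&d);

	assert(cpu1_runs && !cpu2_runs);
	while (irq_handler_done(&d))
		;
	return d;
}
```

In this model both interrupts end up handled by cpu1, and the only thing the 
mask/unmask pair contributed was two extra device accesses.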

Note how the mask/unmask were all just costly extra overhead over the PCI 
bus. If we're talking something like high-performance 10Gbit ethernet (or 
even maybe fast SSD disks), driver writers actually do count PCI cycles, 
because a single PCI read can be several hundred ns, and if you take a 
thousand interrupts per second, it does add up.
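
A rough way to put numbers on that, as a sketch only: the helper below is 
hypothetical, and it assumes the worst case where every interrupt hits the 
race and each mask or unmask costs about as much as one slow bus access. 
Plug in your own figures.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Nanoseconds per second spent on mask/unmask bus traffic, assuming every
 * interrupt pays for bus_ops_per_irq accesses of ns_per_bus_op each.
 */
static uint64_t mask_overhead_ns_per_sec(uint64_t bus_ops_per_irq,
					 uint64_t ns_per_bus_op,
					 uint64_t irqs_per_sec)
{
	uint64_t ns_per_irq = bus_ops_per_irq * ns_per_bus_op;

	/* Sketch-level guard: the final multiplication must not overflow. */
	assert(irqs_per_sec == 0 || ns_per_irq <= UINT64_MAX / irqs_per_sec);
	return ns_per_irq * irqs_per_sec;
}
```

For example, mask_overhead_ns_per_sec(2, 300, 1000) is 600000, i.e. 0.6 ms 
of every second under those assumed inputs, and the figure scales linearly 
with the interrupt rate.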

Of course, ethernet tends to do things like interrupt mitigation to avoid 
this, but that has its own downsides (longer latencies) and isn't really 
considered optimal in some RT environments (Wall Street trading and that 
kind of thing).

I really don't know how big an issue this all is. It probably isn't really 
noticeable.

			Linus