linux-kernel - Re: [RFC][PATCH] irq

Message-ID: <20100624160937.GQ578@basil.fritz.box>
Date:	Thu, 24 Jun 2010 18:09:37 +0200
From:	Andi Kleen <andi@...stfloor.org>
To:	Borislav Petkov <bp@...64.org>
Cc:	Andi Kleen <andi@...stfloor.org>, Ingo Molnar <mingo@...e.hu>,
	Peter Zijlstra <peterz@...radead.org>,
	Huang Ying <ying.huang@...el.com>,
	"H. Peter Anvin" <hpa@...or.com>,
	Borislav Petkov <petkovbb@...glemail.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"mauro@...e.hu" <mauro@...e.hu>
Subject: Re: [RFC][PATCH] irq_work

On Thu, Jun 24, 2010 at 05:41:24PM +0200, Borislav Petkov wrote:
> > If you don't do something
> > (like killing or recovery) you could end up in a loop or consume
> > corrupted data or something else bad. 
> > 
> > So the error has to have a fail safe path from detection to handling.
> 
> So we are talking about a more involved and "could-sleep" error
> recovery.

That's one case; there are others too.

> 
> > That's quite different from logging or performance counting etc.
> > where dropping events on overload is normal and expected.
> 
> So I went back and reread the whole thread, and correct me if I'm
> wrong but the whole run softirq after NMI has one use case for now -
> "could-sleep" error handling for MCEs _only_ on x86. So you're changing

Nope, there are multiple use cases. Today it's background MCE
and possibly perf if it ever decides to share code
with the rest of the kernel instead of wanting to be Bork of Linux. 
Future ones would be more MCE errors and also non-MCE errors like NMIs.

> a bunch of generic and x86 kernel code just for error handling. Hmm,
> that's a kinda big hammer in my book.

Actually no, it would just make the current code slightly cleaner
and somewhat more general.  But for most cases it works without it.

>
> A slimmer solution is a much better way to go, IMHO. I think Peter said
> something about irq_exit(), which should be just fine.

The "slimmer solution" is there, but it has some limitations.
I merely said that softirqs would be useful for solving these limitations
(but are not strictly needed).

Anyway, the slimmer solution was even proposed originally;
it's just that some of the earlier reviews suggested softirqs instead.
So Ying posted softirqs and now he gets flamed for posting
softirqs.  Overall there wasn't much consistency in the suggestions:
three different reviewers suggested three incompatible approaches.

Anyway, if there are no softirqs that's fine too; the error
handler can probably live without them.

> But AFAICT an arch-specific solution would be even better, e.g.
> if you call into your deferred work helper from paranoid_exit in
> <arch/x86/kernel/entry_64.S>. I.e, something like

Yes, that helps for part of the error handling (in fact this
has been implemented), but it does not solve the self-interrupt
problem, which requires delaying until the next cli.

-Andi
-- 
ak@...ux.intel.com -- Speaking for myself only.
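
For readers following the thread, the "fail safe path from detection to
handling" requirement discussed above can be illustrated with a small
user-space sketch. It is not the code from the patch: all names
(deferred_work, defer_work, run_deferred_work, count_event) are
hypothetical, and C11 atomics stand in for the kernel primitives. The
idea shown is only the queue-and-drain part: the restricted context
pushes a preallocated item onto a lock-free list, so nothing is dropped
on overload, and a later safe context detaches the list and runs the
callbacks. How that later context gets triggered (irq_exit(), a self
interrupt, paranoid_exit) is exactly what the thread is arguing about
and is not modelled here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout; this is not the API from the patch. */
struct deferred_work {
    struct deferred_work *next;
    atomic_bool pending;                    /* true while queued */
    void (*func)(struct deferred_work *);   /* runs in the safe context */
};

static _Atomic(struct deferred_work *) pending_list = NULL;

/*
 * Meant for a context that must not block or take locks (think NMI or
 * machine check handler). The caller owns the storage, so there is no
 * allocation here that could fail and lose the event.
 * Returns false if the item is already queued.
 */
static bool defer_work(struct deferred_work *w)
{
    assert(w != NULL && w->func != NULL);

    /* Claim the item; a second enqueue while pending is a no-op. */
    if (atomic_exchange(&w->pending, true))
        return false;

    /* Lock-free push onto the head of the list. */
    struct deferred_work *head = atomic_load(&pending_list);
    do {
        w->next = head;
    } while (!atomic_compare_exchange_weak(&pending_list, &head, w));
    return true;
}

/*
 * Called later from a context where the real handling is allowed.
 * Detaches the whole list in one step, then runs each callback.
 * Returns the number of items processed.
 */
static int run_deferred_work(void)
{
    struct deferred_work *w = atomic_exchange(&pending_list, NULL);
    int n = 0;

    while (w != NULL) {
        struct deferred_work *next = w->next;
        /* Clear pending first so the callback may requeue the item. */
        atomic_store(&w->pending, false);
        w->func(w);
        w = next;
        n++;
    }
    return n;
}

/* Example callback: just counts how many events were handled. */
static int events_handled;

static void count_event(struct deferred_work *w)
{
    (void)w;
    events_handled++;
}
```

A caller would embed a struct deferred_work in its own per-error state,
call defer_work() from the restricted handler, and arrange for
run_deferred_work() to be invoked from wherever the chosen design
allows it. The list is processed newest-first; a real implementation
that cares about ordering would reverse it before running the
callbacks.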