lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20100624140143.GO578@basil.fritz.box>
Date:	Thu, 24 Jun 2010 16:01:43 +0200
From:	Andi Kleen <andi@...stfloor.org>
To:	Ingo Molnar <mingo@...e.hu>
Cc:	Andi Kleen <andi@...stfloor.org>, Borislav Petkov <bp@...64.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Huang Ying <ying.huang@...el.com>,
	"H. Peter Anvin" <hpa@...or.com>,
	Borislav Petkov <petkovbb@...glemail.com>,
	linux-kernel@...r.kernel.org, mauro@...e.hu
Subject: Re: [RFC][PATCH] irq_work

> Please, as Peter and Boris asked you already, quote a concrete, specific 
> example:

It was already in my answer to Peter.

> 
>   'Specific event X occurs, kernel wants/needs to do Y. This cannot be done
>    via the suggested method due to Z.'
> 
> Your generic arguments look wrong (to the extent they are specified) and it 
> makes it much easier and faster to address your points if you dont blur them 
> by vagaries.

It's one of the fundamental properties of recoverable errors.

Error happens.
Machine check or NMI or other exception happens. 
	That exception runs on the exception stack
	The error is not fatal, but recoverable.
For example you want to kill a process or call hwpoison or do some other
	recovery action. These generally have to sleep to do anything
	interesting.
You cannot do the sleeping on the exception stack, so you push it to
another context.

Now just because an error is recoverable doesn't mean it's not critical
(I think that was the mistake Boris made). If you don't do something
(like killing or recovery) you could end up in a loop or consume
corrupted data or something else bad. 

So the error has to have a fail safe path from detection to handling.

That's quite different from logging or performance counting etc.
where dropping events on overload is normal and expected.

Normally it can be only done by using dedicated resources.

-Andi

-- 
ak@...ux.intel.com -- Speaking for myself only.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ