lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20090527072302.GR846@one.firstfloor.org>
Date:	Wed, 27 May 2009 09:23:02 +0200
From:	Andi Kleen <andi@...stfloor.org>
To:	Hidetoshi Seto <seto.hidetoshi@...fujitsu.com>
Cc:	Andi Kleen <andi@...stfloor.org>, linux-kernel@...r.kernel.org,
	hpa@...or.com, x86@...nel.org, Huang Ying <ying.huang@...el.com>,
	Andi Kleen <ak@...ux.intel.com>
Subject: Re: [PATCH 02/31] x86: MCE: Improve mce_get_rip v3

On Wed, May 27, 2009 at 01:29:16PM +0900, Hidetoshi Seto wrote:
> Andi Kleen wrote:
> > From: Huang Ying <ying.huang@...el.com>
> > 
> > Assume RIP is valid when either EIPV or RIPV are set.
> 
> Bad description.
> If RIP means "restart IP" that is valid only if RIPV is set,
> this sentence doesn't make sense completely.

No it doesn't mean restart IP, it just means normal instruction
pointer like everywhere else.

> 
> > This influences
> > whether the machine check exception handler decides to return or panic.
> 
> I suppose you are pointing logics in:

Yes.

> 
>                 mce_get_rip(&m, regs);
>  :
>                         panicm = m;
>  :
>                 /*
>                  * If the EIPV bit is set, it means the saved IP is the
>                  * instruction which caused the MCE.
>                  */
>                 if (m.mcgstatus & MCG_STATUS_EIPV)
>                         user_space = panicm.ip && (panicm.cs & 3);
> 
>                 /*
>                  * If we know that the error was in user space, send a
>                  * SIGBUS.  Otherwise, panic if tolerance is low.
>                  *
>                  * force_sig() takes an awful lot of locks and has a slight
>                  * risk of deadlocking.
>                  */
>                 if (user_space) {
>                         force_sig(SIGBUS, current);
>                 } else if (panic_on_oops || tolerant < 2) {
>                         mce_panic("Uncorrected machine check",
>                                 &panicm, mcestart);
>                 }
> 
> So EIPV without RIPV will be no ip and will result in panic,
> while expected result is SIGBUS. 

First this is only for the !MCA recovery case. In the MCA recovery
case we have more information and can decide better.

In this case no EIPV means that the kernel isn't sure where the 
error occurred so it cannot safely decide if it was user space 
or kernel space and in the tolerant == 2 case has to panic
just in case a kernel kill would cause deadlock.

With MCA recovery this whole this is replaced by a new improved
mechanism using the high level handler.

> > 
> > Also in addition do not force the RIP to be valid with the exact
> > register MSRs.
> 
> I think the forced one is EIP:
> > -		m->mcgstatus |= MCG_STATUS_EIPV;

True. Changed.

> 
> And please note that it keep use CS on stack even if MSR is available.
> 
> I made an alternative patch for this, with no functional change.
> Please consider replacing.

No, sorry I got burned too much last time you touched the description
of this simple patch. I think my description is simple and to the point
and this patch doesn't really deserve anything more.

-Andi
-- 
ak@...ux.intel.com -- Speaking for myself only.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ