Date:	Thu, 17 Dec 2015 00:23:16 -0700
From:	Jeff Merkey <linux.mdb@...il.com>
To:	LKML <linux-kernel@...r.kernel.org>
Cc:	tglx@...utronix.de, mingo@...hat.com, hpa@...or.com,
	x86@...nel.org, peterz@...radead.org, luto@...nel.org
Subject: Re: 4.4-rc5 Setting trap flag inside nmi handler results in HARD LOCKUP

On 12/16/15, Jeff Merkey <linux.mdb@...il.com> wrote:
> Setting the (trap flag | resume flag) inside of an nmi handler results
> in a hard lockup while setting the resume flag works fine.
>
> The watchdog detector fails to detect the lockup.  I am currently
> examining the trap gate and interrupt gate setup on Linux and if
> anyone has any ideas it would be nice to be able to debug and step
> through the nmi handlers.  I got breakpoints to work.  I noticed
> kgdb/kdb just punts here and refuses to allow someone to step inside
> an nmi handler.
>
> There is no reason Linux should not allow this to work since windows
> does and every other OS out there.  I have seen this across some rex64
> sysret calls as well this lockup behavior.
>
> Anyone who is an intel expert with any clues would love some input if
> you know about this problem.
>
> Jeff
>

This bug has been located.  It results from returning from an NMI
with the trap flag set into a userspace address, as Andy suspected, but
it's not due to the RSP value being different as he suggested.  This is
a separate bug from the rex64 sysret bug.

The lockup is tied to the NMI handler code that switches IDT entries if
an NMI fires off on a debug stack.  Ironic, since the code claims it is
switching stacks to enable debugging of NMI handlers and does the
opposite -- breaks them.  Commenting out this code gets rid of the hard
lockup.  The userspace process that gets the trap flag and doesn't
expect it just hangs (but only that process; the rest of the system
keeps running).

So a few bugs to run down still.  NMI handlers can now be debugged -- kind of.

This bug is closed and I will issue a patch for it.  It's a condition
where the trap flag is set inside an NMI handler that exits to a
userspace address.  The code for setting and clearing the trap flag in
the kernel all worked correctly for the userspace path, except that it
put the process to sleep when it shouldn't have.  It's not a condition
that can happen during normal operation unless you set the trap flag
from a debugger inside an NMI handler, try to debug it, and then exit
the handler into userspace, so I think the probability of this showing
up outside a debugging session is low.
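
For illustration only, the condition described above can be sketched
as a small predicate in C.  This is not the kernel's code and not the
patch: struct saved_frame and both helper functions are hypothetical
names made up for this sketch.  The only facts assumed are the
architectural bit positions of TF (bit 8) and RF (bit 16) in the flags
register, and that the low two bits of the saved CS selector give the
privilege level being returned to (3 means userspace).

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural bit positions in EFLAGS/RFLAGS. */
#define EFLAGS_TF (UINT64_C(1) << 8)   /* trap flag: single-step after the next instruction */
#define EFLAGS_RF (UINT64_C(1) << 16)  /* resume flag: suppress instruction breakpoint once */

/*
 * Hypothetical stand-in for the register frame saved on exception
 * entry.  It is NOT the kernel's struct pt_regs; it holds only the
 * fields this sketch needs.
 */
struct saved_frame {
    uint64_t ip;     /* saved instruction pointer */
    uint64_t cs;     /* saved code segment selector */
    uint64_t flags;  /* saved flags register */
};

/* The low two bits of CS are the privilege level; 3 means userspace. */
static bool returns_to_user(const struct saved_frame *f)
{
    return (f->cs & 3) == 3;
}

/*
 * True when the handler is about to return to userspace with the
 * trap flag still set in the saved flags, which is the case
 * described in this message.
 */
static bool tf_exit_to_user(const struct saved_frame *f)
{
    return returns_to_user(f) && (f->flags & EFLAGS_TF) != 0;
}
```

A debugger hooked into the NMI path could use a check like this to
decide whether to clear TF before the return, though where such a
check belongs in the real entry code is a separate question.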

I verified that kgdb/kdb also experiences this bug (if I comment out
the code that blocks folks from debugging NMI handlers with kgdb/kdb).

Jeff