Message-ID: <CALCETrXDbRhXZcMM7ao___ExZ_9Hyh5cDorbvT1r8dfZor+9bA@mail.gmail.com>
Date:	Thu, 16 Jul 2015 18:53:15 -0700
From:	Andy Lutomirski <luto@...capital.net>
To:	Sasha Levin <sasha.levin@...cle.com>,
	Frédéric Weisbecker <fweisbec@...il.com>,
	Paul McKenney <paulmck@...ux.vnet.ibm.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Peter Zijlstra <peterz@...radead.org>, X86 ML <x86@...nel.org>,
	Rik van Riel <riel@...hat.com>
Subject: Reconciling rcu_irq_enter()/rcu_nmi_enter() with context tracking

For reasons that mystify me a bit, we currently track context tracking
state separately from RCU's watching state.  This results in strange
artifacts: nothing generic causes IRQs to enter CONTEXT_KERNEL, we
can nest exceptions inside the IRQ handler (an example would be
wrmsr_safe failing), and, in -next, we splat a warning:

https://gist.github.com/sashalevin/a006a44989312f6835e7

I'm trying to make context tracking more exact, which will fix this
issue (the particular splat that Sasha hit shouldn't be possible when
I'm done), but I think it would be nice to unify all of this stuff.
Would it be plausible for us to guarantee that RCU state is always in
sync with context tracking state?  If so, we could maybe simplify
things and have fewer state variables.
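
To make the "fewer state variables" idea concrete, here is a minimal
userspace sketch, not kernel code.  Every name in it (model_ctx,
model_enter_kernel and so on) is made up for illustration, and it
assumes the simplest possible rule: RCU is watching exactly when the
tracked context is the kernel.  The real rule would need more cases.

```c
#include <stdbool.h>

/* Hypothetical toy model of one CPU; none of these names exist in the kernel. */
enum model_state { MODEL_USER, MODEL_KERNEL };

struct model_ctx {
	enum model_state state;	/* the single source of truth */
	int nesting;		/* kernel entries currently nested (IRQs, exceptions) */
};

/* The RCU "watching" answer is derived from the context state, not stored. */
static bool model_rcu_watching(const struct model_ctx *c)
{
	return c->state == MODEL_KERNEL;
}

/*
 * Every kernel entry, IRQs included, goes through the same helper.
 * It returns the previous state so the caller can restore it on exit.
 */
static enum model_state model_enter_kernel(struct model_ctx *c)
{
	enum model_state prev = c->state;

	c->state = MODEL_KERNEL;
	c->nesting++;
	return prev;
}

/* Undo one entry by restoring the state saved by model_enter_kernel(). */
static void model_exit_kernel(struct model_ctx *c, enum model_state prev)
{
	c->nesting--;
	c->state = prev;
}
```

With this shape, an exception nested inside an IRQ saves MODEL_KERNEL
and restores MODEL_KERNEL, so the two views cannot drift apart.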

Doing this for NMIs might be weird.  Would it make sense to have a
CONTEXT_NMI that's somehow valid even if the NMI happened while
changing context tracking state?

Thoughts?  As it stands, I think we might already be broken for real:

Syscall -> user_exit.  Perf NMI hits *during* user_exit.  Perf does
copy_from_user_nmi, which can fault, causing do_page_fault to get
called, which calls exception_enter() while the context tracking state
is still mid-update, which can't be a good thing.

RCU is okay (sort of) because of rcu_nmi_enter, but this seems very fragile.
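
The NMI scenario above can be replayed in a small userspace model.
This is only a sketch under assumptions: user_exit is split into a
"begin" and a "finish" half so that something can land in between, and
the modelled exception_enter simply runs a full user_exit when the
state still says user.  All names are hypothetical; the real functions
do considerably more.

```c
#include <stdbool.h>

/* Hypothetical toy model; none of these names exist in the kernel. */
enum race_state { RACE_USER, RACE_KERNEL };

struct race_ctx {
	enum race_state state;
	bool in_transition;	/* true while a state change is half done */
	int reentered;		/* transitions started inside another transition */
};

/* First half of the modelled user_exit: work has begun, state is still stale. */
static void race_user_exit_begin(struct race_ctx *c)
{
	if (c->in_transition)
		c->reentered++;	/* this is the fragile case */
	c->in_transition = true;
}

/* Second half of the modelled user_exit: publish the new state. */
static void race_user_exit_finish(struct race_ctx *c)
{
	c->state = RACE_KERNEL;
	c->in_transition = false;
}

/* Modelled exception_enter: if the state says user, do a whole user_exit. */
static enum race_state race_exception_enter(struct race_ctx *c)
{
	enum race_state prev = c->state;

	if (prev == RACE_USER) {
		race_user_exit_begin(c);
		race_user_exit_finish(c);
	}
	return prev;
}

/* Replay the scenario and return how many transitions were re-entered. */
static int race_nmi_during_user_exit(void)
{
	struct race_ctx c = { RACE_USER, false, 0 };

	race_user_exit_begin(&c);	/* syscall entry starts user_exit */
	race_exception_enter(&c);	/* NMI -> page fault -> exception_enter */
	race_user_exit_finish(&c);	/* the interrupted user_exit resumes */
	return c.reentered;
}
```

In this model the nested exception_enter sees the stale user state and
starts a second transition inside the first one, which is the kind of
re-entry the scenario describes.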

Thoughts?  As it stands, I need to do something because -tip and thus
-next spews occasional warnings.

--Andy

-- 
Andy Lutomirski
AMA Capital Management, LLC