lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CALCETrVqA0JVxZeD+7vMD-9xmF8udXw5RO0fwQA2X45DSVqM0Q@mail.gmail.com>
Date:	Tue, 18 Aug 2015 16:07:51 -0700
From:	Andy Lutomirski <luto@...capital.net>
To:	Frederic Weisbecker <fweisbec@...il.com>
Cc:	Andy Lutomirski <luto@...nel.org>, X86 ML <x86@...nel.org>,
	Sasha Levin <sasha.levin@...cle.com>,
	Brian Gerst <brgerst@...il.com>,
	Denys Vlasenko <dvlasenk@...hat.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Oleg Nesterov <oleg@...hat.com>,
	Borislav Petkov <bp@...en8.de>, Rik van Riel <riel@...hat.com>
Subject: Re: [PATCH] x86/entry/64: Context-track syscalls before enabling interrupts

On Tue, Aug 18, 2015 at 4:02 PM, Frederic Weisbecker <fweisbec@...il.com> wrote:
> On Tue, Aug 18, 2015 at 03:35:30PM -0700, Andy Lutomirski wrote:
>> On Tue, Aug 18, 2015 at 3:16 PM, Frederic Weisbecker <fweisbec@...il.com> wrote:
>> > On Tue, Aug 18, 2015 at 12:11:59PM -0700, Andy Lutomirski wrote:
>> >> This fixes a couple minor holes if we took an IRQ very early in syscall
>> >> processing:
>> >>
>> >>  - We could enter the IRQ with CONTEXT_USER.  Everything worked (RCU
>> >>    was fine), but we could warn if all the debugging options were
>> >>    set.
>> >
>> > So this is fixing issues after your changes that call user_exit() from
>> > IRQs, right?
>>
>> Yes.  Here's an example splat, courtesy of Sasha:
>>
>> https://gist.github.com/sashalevin/a006a44989312f6835e7
>>
>> >
>> > But the IRQs aren't supposed to call user_exit(), they have their own hooks.
>> > That's where the real issue is.
>>
>> In -tip, the assumption is that we *always* switch to CONTEXT_KERNEL
>> when entering the kernel for a non-NMI reason.
>
> Why? IRQs don't need that! We already have irq_enter()/irq_exit().
>

Those are certainly redundant.  I want to have a real hook to call
that says "switch to IRQ context from CONTEXT_USER" or "switch to IRQ
context from CONTEXT_KERNEL" (aka noop), but that doesn't currently
exist.

> And we don't want to call rcu_user_*() pairs on IRQs, you're
> introducing a serious performance regression here! And I'm talking about
> the code that's currently in -tip.

Is there an easy way to fix it?  For example, could we figure out what
makes it take so long and make it faster?  If we need to, we could
back out the IRQ bit and change the assertions for 4.3, but I'd rather
keep the exact context tracking if at all possible.

>
>> That means that we can
>> avoid all of the (expensive!) checks for what context we're in.
>
> If you're referring to context tracking, the context check is a per-cpu
> read. Not something that's usually considered expensive.

In -tip, there aren't even extra branches, except those imposed by the
user_exit implementation.

>
>> It also means that (other than IRQs, which need further cleanup), we only
>> switch once per user/kernel switch.
>
> ???

In 4.2 and before, we can switch multiple times on the way out of the
kernel, via SCHEDULE_USER, do_notify_resume, etc.  In -tip, we do it
exactly once no matter what.

>
>>
>> The cost for doing should be essentially zero, modulo artifacts from
>> poor inlining.
>
> And modulo rcu_user_*() that do multiple costly atomic_add_return() operations
> implying full memory barriers. Plus the unnecessary vtime accounting that doubles
> the existing one in irq_enter/exit() (those even imply a lock currently, which will
> probably be turned to seqcount, but still, full memory barriers...).
>
> I'm sorry but I'm going to NACK any code that does that in IRQs (and again that
> concerns current tip:x86/asm).

Why do we need these heavyweight barriers?

If there's actually a measurable performance hit in IRQs in -tip, then
can we come up with a better fix?  For example, we could change all
the new CT_WARN_ON calls to check "are we in CONTEXT_KERNEL or in IRQ
context" and make the IRQ entry do a lighter weight context tracking
operation.

But I think I'm still missing something fundamental about the
performance: why is irq_enter() any faster than user_exit()?

--Andy

-- 
Andy Lutomirski
AMA Capital Management, LLC
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ