Message-ID: <CALCETrXvREAXmjS-FkAMTDYDnTsBsrAYGKo32=fgEGJqC8k6Yg@mail.gmail.com>
Date:	Mon, 21 Mar 2016 11:39:07 -0700
From:	Andy Lutomirski <luto@...capital.net>
To:	Andi Kleen <andi@...stfloor.org>
Cc:	X86 ML <x86@...nel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: Updated version of RD/WR FS/GS BASE patchkit

On Mon, Mar 21, 2016 at 9:16 AM, Andi Kleen <andi@...stfloor.org> wrote:
> This is a reworked version of my older fsgsbase patchkit.
> Main changes:
> - Ported to new entry/* code, which simplified it somewhat
> - Now has a test program
> - Fixed ptrace/core dump support
> - Better documentation
> - Some minor fixes improvement

I think that the biggest remaining issue is to define the semantics.

As an architectural matter, the relevant user state is (fs selector,
fs base, gs selector, gs base).  With FSGSBASE enabled, user code can
more or less independently control all four of those values.  (It's
slightly more complicated than that because set_thread_area and
modify_ldt both forget to reload segment registers IIRC, but we can
fix that independently.)
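
To make the four-value state concrete, here is a rough userspace model in C.  It is only a sketch: seg_state, sel_fields, encode_selector and decode_selector are hypothetical names for illustration, not kernel code.  The selector layout is the architectural one (RPL in bits 0-1, table indicator in bit 2, descriptor index in bits 3-15); set_thread_area entries live in the GDT and modify_ldt entries live in the LDT.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the user-visible segment state discussed above. */
struct seg_state {
	uint16_t fs_sel;
	uint64_t fs_base;
	uint16_t gs_sel;
	uint64_t gs_base;
};

/* Decoded view of a segment selector (names are illustrative). */
struct sel_fields {
	unsigned index;   /* descriptor index within the table */
	unsigned is_ldt;  /* 0 = GDT, 1 = LDT */
	unsigned rpl;     /* requested privilege level */
};

/* Split a selector into its architectural fields. */
static struct sel_fields decode_selector(uint16_t sel)
{
	struct sel_fields f;

	f.rpl = sel & 0x3;
	f.is_ldt = (sel >> 2) & 0x1;
	f.index = sel >> 3;
	return f;
}

/* Build a selector from its fields; the inverse of decode_selector(). */
static uint16_t encode_selector(unsigned index, unsigned is_ldt, unsigned rpl)
{
	assert(index < 8192);  /* 13-bit index */
	assert(is_ldt < 2);
	assert(rpl < 4);
	return (uint16_t)((index << 3) | (is_ldt << 2) | rpl);
}
```

With FSGSBASE, the base fields of seg_state can change without the selector fields changing, which is why the questions below treat them as separate pieces of state.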

Keeping in mind that we'll probably want to add percpu segment bases
at some point (to allow very fast atomic percpu data access for user
code), the questions I have are:

1a. What happens when a task switches out and back in on the same CPU?

1b. What happens when a task switches out and back in on a different CPU?

2a. What happens when a tracer reads the state out and writes exactly
the same thing back in and the task resumes on the CPU it started on?

2b. What happens when a tracer reads the state out and writes exactly
the same thing back in and the task resumes on a different CPU?

3. What happens if fs or gs points to a real descriptor and that
descriptor changes?

4. Does the sigcontext format need to change?

For maximum safety, comprehensibility, and sanity, there's an argument
to be made that 1a and 2a should leave the state exactly as it started
and that 1b and 2b should leave it alone unless percpu bases are in
use.  For maximum simplicity of implementation, there's an argument
that, if the fs or gs selector is nonzero and the base doesn't match
the in-memory descriptor, then the kernel can do whatever it wants.

I propose the following semantics:

 - All "save state" or "report state" events unconditionally save the
base and selector as they actually were in the CPU state.  (Keep it
simple.  Also, with these patches applied, on an FSGSBASE-capable CPU,
selector != 0 is a slow path.)

 - When restoring state, if selector == 0, then the base is restored as it was.

 - When restoring state, if selector != 0, then the base is restored
to whatever the in-memory descriptor says.  (Optionally, down the
road, we could make it so that a save + restore without an intervening
migration, set_thread_area, or modify_ldt would restore the base as it
was.  This would make things more predictable.)

 - If/when we add percpu bases, they are associated with a nonzero selector.
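
As a sanity check on the save/restore rules above, here is a minimal userspace model in C.  It is a sketch under assumptions, not the patch: seg_reg, save_seg and restore_seg are made-up names, and descriptor_base stands in for whatever base the in-memory GDT/LDT descriptor for the saved selector holds at restore time.  The optional "restore the base as it was if nothing intervened" refinement is not modeled.

```c
#include <assert.h>
#include <stdint.h>

/* One segment register as seen by this model: selector plus base. */
struct seg_reg {
	uint16_t sel;
	uint64_t base;
};

/* Save: unconditionally record what the CPU actually holds. */
static struct seg_reg save_seg(const struct seg_reg *cpu)
{
	assert(cpu);
	return *cpu;
}

/*
 * Restore: the selector always comes back as saved.
 *  - selector == 0: the saved base is restored as it was.
 *  - selector != 0: the base comes from the in-memory descriptor,
 *    passed in here as descriptor_base (ignored when selector == 0).
 */
static struct seg_reg restore_seg(const struct seg_reg *saved,
				  uint64_t descriptor_base)
{
	struct seg_reg cpu;

	assert(saved);
	cpu.sel = saved->sel;
	cpu.base = (saved->sel == 0) ? saved->base : descriptor_base;
	return cpu;
}
```

In this model, a task that sets its gs base with WRGSBASE while gs holds a nonzero selector loses that base on the next restore: it gets the descriptor's base back instead.  A task that keeps the selector at 0 has its base preserved exactly.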

The big open question is: should signal delivery and restore do
anything to the selectors or bases?  I think that, by default, it
can't, but maybe we'll want an option to do it some day.

Does all this make sense?  Do people agree with me?
