Message-ID: <20190117122253.GC5023@zn.tnic>
Date:   Thu, 17 Jan 2019 13:22:53 +0100
From:   Borislav Petkov <bp@...en8.de>
To:     Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Cc:     linux-kernel@...r.kernel.org, x86@...nel.org,
        Andy Lutomirski <luto@...nel.org>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Radim Krčmář <rkrcmar@...hat.com>,
        kvm@...r.kernel.org, "Jason A. Donenfeld" <Jason@...c4.com>,
        Rik van Riel <riel@...riel.com>,
        Dave Hansen <dave.hansen@...ux.intel.com>
Subject: Re: [PATCH 05/22] x86/fpu: Remove fpu->initialized usage in
 copy_fpstate_to_sigframe()

On Wed, Jan 16, 2019 at 11:40:37PM +0100, Sebastian Andrzej Siewior wrote:
> Actually we do. copy_fpregs_to_sigframe() saves current FPU registers to
> task's stack frame which is userspace memory.

I know we do - I was only pointing at the suboptimal choice of words:
instead of "save registers to userspace", rather say "save hardware
registers to user buffers" or so.

> I think *parts* of the ->initialized field were wrongly converted while
> lazy-FPU was removed *or* it was forgotten to be removed afterwards. Or
> I don't know but it looks like a leftover.
> 
> At the beginning (while it was added) it was part of the lazy-FPU code.
> So if the task's FPU registers are not active then they are saved in task's
> FPU struct. So in this case (the else path) it does
> 	__copy_to_user(buf_fx, xsave, fpu_user_xstate_size)

So far, so good. Comment above says so too:

 * If the fpu, extended register state is live, save the state directly
 * to the user frame pointed by the aligned pointer 'buf_fx'. Otherwise,
 * copy the thread's fpu state to the user frame starting at 'buf_fx'.

> In the other case (task's FPU struct is not up to date, the current
> FPU register content is in CPU's registers) it does
> 	copy_fpregs_to_sigframe(buf_fx)

ACK.
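
The two paths discussed so far can be sketched in plain userspace C. This is only a mock under stated assumptions: `mock_task`, `mock_fpstate`, `mock_cpu_regs` and `mock_copy_fpstate_to_sigframe()` are hypothetical names made up for illustration, and the `memcpy()` calls merely stand in for the kernel's copy_fpregs_to_sigframe() and __copy_to_user(). It is not the kernel code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an FPU state image; not a kernel type. */
struct mock_fpstate {
	unsigned char bytes[64];
};

/* Hypothetical stand-in for the per-task FPU bookkeeping. */
struct mock_task {
	bool fpregs_live;		/* true: newest state is in the CPU registers */
	struct mock_fpstate saved;	/* in-memory copy in the task's FPU struct */
};

/* Pretend hardware registers (assumption, for the mock only). */
static struct mock_fpstate mock_cpu_regs;

/* Returns 0 on success, mirroring the kernel's error convention. */
static int mock_copy_fpstate_to_sigframe(const struct mock_task *t,
					  struct mock_fpstate *buf_fx)
{
	if (t->fpregs_live) {
		/* Live case: stands in for copy_fpregs_to_sigframe(buf_fx). */
		memcpy(buf_fx, &mock_cpu_regs, sizeof(*buf_fx));
	} else {
		/*
		 * Not-live case: stands in for
		 * __copy_to_user(buf_fx, xsave, fpu_user_xstate_size).
		 */
		memcpy(buf_fx, &t->saved, sizeof(*buf_fx));
	}
	return 0;
}
```

Either way the user frame ends up holding the task's most recent FPU state; the only difference is where it is read from.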

> How does using_compacted_format() fit in here?
> The point is that the "compacted" format is never exposed to
> userland so it requires normal xsave. So far so good, right? But how
> does it work in the '->initialized = 0' case?  It was
> introduced in commit
>   99aa22d0d8f7 ("x86/fpu/xstate: Copy xstate registers directly to the signal frame when compacted format is in use")
> 
> and it probably does not explain why this works, right?

I think this was imposed by our inability to handle XSAVES compacted
format. And that should be fixed now, AFAICR.

> So *either* fpregs_active() was always true if the task used FPU *once*
> or if it used FPU *recently* and task's FPU registers are active (I don't
> remember anymore). Anyway:
> a) we don't get here because caller checks for fpregs_active() before
>    invoking copy_fpstate_to_sigframe()

Ok.

> b) a preemption check resets fpregs_active() after the first check
>    then we do "xsave", xsave traps because FPU is off/disabled, trap
>    loads task's FPU registers, gets back to "xsave", "xsave" saves
>    CPU's register to the stack frame.
> 
> The b part does not work like that since commit
>   bef8b6da9522 ("x86/fpu: Handle #NM without FPU emulation as an error")
> 
> but then at that point it was "okay" because fpregs_active() would
> return true if the task used FPU registers at least once. If it did not
> use them then it would not invoke that function (the caller checks for
> fpregs_active()).

Right, AFAICT, we were moving to eager FPU at that time and this commit
is part of the lazy FPU removal stuff.

> So I can't tell you why it is okay but I can explain why it is done
> (well, that part I puzzled together).

I hate the fact that we have to puzzle stuff together for the FPU code.
;-\

> The task is running and using FPU registers. Then an evil mind sends a
> signal. The task goes into kernel, prepares itself and is about to
> handle the signal in userland. It saves its FPU registers on the stack
> frame. It zeros its current FPU registers (ready for a fresh start),
> loads the address of the signal handler and returns to user land
> handling the signal.
> 
> Now. The signal handler may use FPU registers and the signal handler
> may be preempted so you need to save the FPU registers of the signal
> handler and you can't mix them up with the FPU registers of the task
> (before it started handling the signal).
> 
> So in order to avoid a second FPU struct it saves them on user's stack
> frame. I *think* this (avoiding a second FPU struct) is the primary
> motivation.

Yah, makes sense. Sounds like something we'd do :-)

> A bonus point might be that the signal handler has a third
> argument the `context'. That means you can access the task's FPU
> registers from the signal handler. Not sure *why* you want to do so but
> you can.

For <raisins>.
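
A minimal userspace sketch of what "access the task's FPU registers from the signal handler" looks like, assuming Linux on x86-64 with a libc that exposes `uc_mcontext.fpregs` (glibc does under _GNU_SOURCE). The names `on_sigusr1()`, `probe_sigframe_fpstate()`, `handler_ran`, `fpstate_seen` and `saved_mxcsr` are made up for illustration; on other architectures the handler just records that it ran.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <string.h>
#include <ucontext.h>

static volatile sig_atomic_t handler_ran;
static volatile sig_atomic_t fpstate_seen;
static volatile unsigned int saved_mxcsr;

/* Three-argument handler: 'context' points at the saved user context. */
static void on_sigusr1(int sig, siginfo_t *info, void *context)
{
	(void)sig;
	(void)info;
#if defined(__linux__) && defined(__x86_64__)
	ucontext_t *uc = context;

	/* fpregs points into the signal frame on the user stack. */
	if (uc->uc_mcontext.fpregs) {
		saved_mxcsr = uc->uc_mcontext.fpregs->mxcsr;
		fpstate_seen = 1;
	}
#else
	(void)context;
#endif
	handler_ran = 1;
}

/* Installs the handler, raises SIGUSR1 once; returns 0 if the handler ran. */
static int probe_sigframe_fpstate(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_sigaction = on_sigusr1;
	sa.sa_flags = SA_SIGINFO;	/* request the siginfo/context arguments */
	sigemptyset(&sa.sa_mask);

	if (sigaction(SIGUSR1, &sa, NULL) != 0)
		return -1;
	if (raise(SIGUSR1) != 0)
		return -1;
	return handler_ran ? 0 : -1;
}
```

The handler only reads the interrupted context's MXCSR value from the frame; whatever FPU state the handler itself touches is separate, which is the point of saving the task's registers to the user stack before the handler runs.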

> I can't imagine a use case and I was looking for a user and expecting it
> to be glibc but I didn't find anything in the glibc that would explain
> it. Intel even defines a few bytes as "user reserved" which are used by
> "struct _fpx_sw_bytes" to add a marker in the signal and recognise it on
> restore.
> The only user that seems to make use of that is `criu' (or it looked
> like it does use it). I would prefer to add a second struct-FPU and use
> that for the signal handler. This would avoid the whole dance here.

That would be interesting from the perspective of making the code
straightforward and not having to document all that dance somewhere.

> And `criu' could maybe become a proper interface. I don't think as of
> now that it will break something in userland if the signal handler
> suddenly does not have a pointer to the FPU struct.

Well, but allocating a special FPU pointer for the signal handler
context sounds simple and clean, no? Or are we afraid that that would
slow down signal handling, the whole allocation and assignment and
stuff...?

> Okay. So I was verbose *now*. Depending on what you say (or don't) I
> will try to recycle this into commit message in a few days.

Yeah, much much better. Thanks a lot for the effort!

-- 
Regards/Gruss,
    Boris.
