Date:   Wed, 3 Feb 2021 10:18:11 -0800
From:   Andy Lutomirski <luto@...nel.org>
To:     Linus Torvalds <torvalds@...ux-foundation.org>
Cc:     Gabriel Krisman Bertazi <krisman@...labora.com>,
        Kyle Huey <me@...ehuey.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Andy Lutomirski <luto@...nel.org>,
        open list <linux-kernel@...r.kernel.org>,
        "Robert O'Callahan" <rocallahan@...il.com>
Subject: Re: [PATCH] entry: Fix missed trap after single-step on system call return

On Wed, Feb 3, 2021 at 10:10 AM Linus Torvalds
<torvalds@...ux-foundation.org> wrote:
>
> On Wed, Feb 3, 2021 at 10:00 AM Gabriel Krisman Bertazi
> <krisman@...labora.com> wrote:
> >
> > Does the patch below follow your suggestion?  I'm setting the
> > SYSCALL_WORK shadowing TIF_SINGLESTEP every time, instead of only when
> > the child is inside a system call.  Is this acceptable?
>
> Looks sane to me.
>
> My main worry would be about "what about the next system call"? It's
> not what Kyle's case cares about, but let me just give an example:
>
>  - task A traces task B, and starts single-stepping. Task B was *not*
> in a system call at this point.
>
>  - task B happily executes one instruction at a time, takes a TF
> fault, everything is good
>
>  - task B now does a system call. That will disable single-stepping
> while in the kernel
>
>  - task B returns from the system call. TF will be set in eflags, but
> the first instruction *after* the system call will execute unless we
> go through the system call exit path
>
> So I think the tracer basically misses one instruction when single-stepping.

I was hoping you wouldn't ask this :)

The x86 architecture is fundamentally a bit busted here.  If we return
from a system call with SYSRET and TF is set in R11, then SYSRET
traps, which means that #DB is delivered before executing a user
instruction.  I have been asking Intel for quite a while to document
this, and they said they did, but I still can't find it.  IRET is the
opposite: if we return from a system call with IRET and TF is set on
the stack, we execute one user instruction and then trap.
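
As a toy illustration of the asymmetry described above, the two return
paths can be modeled as below. This is only a sketch of the behavior as
stated in this mail, not kernel code and not a restatement of vendor
documentation; the function name `user_insns_before_db` and the string
labels "sysret" and "iret" are made up for the example.

```python
from typing import Optional


def user_insns_before_db(return_insn: str, tf_set: bool) -> Optional[int]:
    """Toy model: user instructions that run before #DB after a syscall return.

    Returns None when TF is clear, i.e. no single-step trap is expected.
    """
    if not tf_set:
        return None
    if return_insn == "sysret":
        # TF is restored from R11; per the mail, SYSRET itself traps,
        # so #DB is delivered before any user instruction executes.
        return 0
    if return_insn == "iret":
        # TF is restored from the stack frame; one user instruction
        # executes and then the trap is taken.
        return 1
    raise ValueError(f"unknown return instruction: {return_insn!r}")
```

In this model a tracer sees the trap one instruction later on the IRET
path than on the SYSRET path, which is the inconsistency at issue.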

So if we want to reliably single-step a system call and trap after the
system call, we just need to synthesize a trap on the way out.  Doing
this and getting all the nasty corners right (e.g. sigreturn setting TF,
sigreturn *clearing* TF, signal delivery as part of the syscall, and
ptrace mucking with TF) might be nontrivial.
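
A minimal sketch of the "synthesize a trap on the way out" idea, again
as a toy model rather than kernel code: `Task`, `singlestep`, `events`
and `syscall_exit_to_user` are hypothetical names, with `singlestep`
standing in for a per-task single-step flag. It deliberately ignores
the corner cases listed above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Task:
    singlestep: bool = False  # hypothetical stand-in for a single-step flag
    events: List[str] = field(default_factory=list)  # what a tracer would observe


def syscall_exit_to_user(task: Task, return_insn: str) -> None:
    """Toy exit path: report the step in software before returning to user mode."""
    if task.singlestep:
        # The trap is queued by the exit path itself, so it does not depend
        # on whether the CPU returns to user mode via SYSRET or IRET.
        task.events.append("trap-after-syscall")
    task.events.append(f"return-via-{return_insn}")
```

With the trap generated in software, both return paths give the tracer
the same ordering: the trap is recorded before the return to user mode.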

I suspect the behavior back in the bad old asm-entry-path days was at
best inconsistent.

--Andy
