Message-ID: <CALCETrWpouBd+DqVu594B-94MQH_D0D7sECXZHEoAa+=X-_0=A@mail.gmail.com>
Date: Wed, 3 Feb 2021 10:18:11 -0800
From: Andy Lutomirski <luto@...nel.org>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Gabriel Krisman Bertazi <krisman@...labora.com>,
Kyle Huey <me@...ehuey.com>,
Thomas Gleixner <tglx@...utronix.de>,
Andy Lutomirski <luto@...nel.org>,
open list <linux-kernel@...r.kernel.org>,
"Robert O'Callahan" <rocallahan@...il.com>
Subject: Re: [PATCH] entry: Fix missed trap after single-step on system call return

On Wed, Feb 3, 2021 at 10:10 AM Linus Torvalds
<torvalds@...ux-foundation.org> wrote:
>
> On Wed, Feb 3, 2021 at 10:00 AM Gabriel Krisman Bertazi
> <krisman@...labora.com> wrote:
> >
> > Does the patch below follow your suggestion? I'm setting the
> > SYSCALL_WORK shadowing TIF_SINGLESTEP every time, instead of only when
> > the child is inside a system call. Is this acceptable?
>
> Looks sane to me.
>
> My main worry would be about "what about the next system call"? It's
> not what Kyle's case cares about, but let me just give an example:
>
> - task A traces task B, and starts single-stepping. Task B was *not*
> in a system call at this point.
>
> - task B happily executes one instruction at a time, takes a TF
> fault, everything is good
>
> - task B now does a system call. That will disable single-stepping
> while in the kernel
>
> - task B returns from the system call. TF will be set in eflags, but
> the first instruction *after* the system call will execute unless we
> go through the system call exit path
>
> So I think the tracer basically misses one instruction when single-stepping.

I was hoping you wouldn't ask this :)

The x86 architecture is fundamentally a bit busted here. If we return
from a system call with SYSRET and TF is set in R11, then SYSRET
traps, which means that #DB is delivered before executing a user
instruction. I have been asking Intel for quite a while to document
this, and they said they did, but I still can't find it. IRET is the
opposite: if we return from a system call with IRET and TF is set on
the stack, we execute one user instruction and then trap.
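
As a rough illustration of the asymmetry described above, here is a toy Python model. It is not kernel code and it does not reproduce anything from vendor documentation; it only encodes the two behaviours stated in this message. The names `INSNS_BEFORE_DB` and `executed_before_trap`, and the instruction labels, are made up for the example.

```python
# Toy model only: the two behaviours are taken from the description above,
# not from vendor documentation. All names here are hypothetical.
INSNS_BEFORE_DB = {
    "sysret": 0,  # SYSRET with TF in R11: #DB before any user instruction
    "iret": 1,    # IRET with TF on the stack: one user instruction, then #DB
}


def executed_before_trap(return_insn, user_insns, tf_set=True):
    """Return the user instructions that run before the first #DB."""
    if not tf_set:
        # No single-stepping requested: nothing traps, everything runs.
        return list(user_insns)
    return list(user_insns[:INSNS_BEFORE_DB[return_insn]])


after_syscall = ["mov", "add", "ret"]
print(executed_before_trap("sysret", after_syscall))  # []
print(executed_before_trap("iret", after_syscall))    # ['mov']
```

In this model a tracer that relies only on the hardware trap sees the stop at a different instruction depending on which return instruction the kernel happened to use.
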

So if we want to reliably single-step a system call and trap after the
system call, we just need to synthesize a trap on the way out. Doing
this and getting all the nasty corner cases right (e.g. sigreturn
setting TF, sigreturn *clearing* TF, signal delivery as part of the
syscall, ptrace mucking with TF) might be nontrivial.
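
The effect of synthesizing the trap can be sketched with another toy Python model, under the same assumptions as the description above. This is not the kernel's implementation: `insns_before_first_stop` and its parameters are hypothetical, and none of the corner cases listed above (sigreturn changing TF, signal delivery during the syscall, ptrace modifying TF) are modelled.

```python
# Toy model only; names are hypothetical and this is not kernel code.
HW_INSNS_BEFORE_DB = {"sysret": 0, "iret": 1}


def insns_before_first_stop(return_insn, singlestep, synthesize_trap):
    """User instructions that run after a syscall before the tracer stops.

    Returns None when the task is not being single-stepped.
    """
    if not singlestep:
        return None
    if synthesize_trap:
        # The exit path reports the trap itself, before returning to user
        # mode, so the return instruction no longer matters.
        return 0
    # Otherwise the stop point is whatever the hardware happens to do.
    return HW_INSNS_BEFORE_DB[return_insn]


for path in ("sysret", "iret"):
    print(path,
          insns_before_first_stop(path, True, synthesize_trap=False),
          insns_before_first_stop(path, True, synthesize_trap=True))
```

With the synthesized trap the model gives the same stop point for both return paths, which is the consistency being asked for.
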

I suspect the behavior back in the bad old asm-entry-path days was at
best inconsistent.

--Andy