linux-kernel - Re: [RFC PATCH 4/6] uprobes/x86: Emulate rip-relative call's

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1396995963.5056.46.camel@oc7886638347.ibm.com.usor.ibm.com>
Date:	Tue, 08 Apr 2014 15:26:03 -0700
From:	Jim Keniston <jkenisto@...ux.vnet.ibm.com>
To:	Oleg Nesterov <oleg@...hat.com>
Cc:	Ingo Molnar <mingo@...e.hu>,
	Srikar Dronamraju <srikar@...ux.vnet.ibm.com>,
	Ananth N Mavinakayanahalli <ananth@...ibm.com>,
	Anton Arapov <aarapov@...hat.com>,
	David Long <dave.long@...aro.org>,
	Denys Vlasenko <dvlasenk@...hat.com>,
	"Frank Ch. Eigler" <fche@...hat.com>,
	Jonathan Lebon <jlebon@...hat.com>,
	Masami Hiramatsu <masami.hiramatsu.pt@...achi.com>,
	linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH 4/6] uprobes/x86: Emulate rip-relative call's

On Sun, 2014-04-06 at 22:16 +0200, Oleg Nesterov wrote:
> 0xe8. Anything else?

No, I think e8 is the only call instruction uprobes will see.

> 

The following couple of paragraphs should be included in the code,
perhaps merged with some of the related comments already there.  Without
it, the code is pretty hard to follow.

> Emulating of rip-relative call is trivial, we only need to additionally
> push the ret-address. If this fails, we execute this instruction out of
> line and this should trigger the trap, the probed application should die
> or the same insn will be restarted if a signal handler expands the stack.
> We do not even need ->post_xol() for this case.
> 
> But there is a corner (and almost theoretical) case: another thread can
> expand the stack right before we execute this insn out of line. In this
> case it hit the same problem we are trying to solve. So we simply turn
> the probed insn into "call 1f; 1:" and add ->post_xol() which restores
> ->sp and restarts.

Can you explain under what circumstances emulation of the call
instruction would fail?  Will the copy_to_user() call that writes the
return address to the stack grow the stack if necessary?  Will
single-stepping the mangled call instruction expand the stack?

> 
> Many thanks to Jonathan who finally found the standalone reproducer,
...
> Reported-by: Jonathan Lebon <jlebon@...hat.com>
> Signed-off-by: Oleg Nesterov <oleg@...hat.com>
> ---
>  arch/x86/include/asm/uprobes.h |    1 +
>  arch/x86/kernel/uprobes.c      |   69 ++++++++++++++++++++++++++++++++++------
>  2 files changed, 60 insertions(+), 10 deletions(-)
> 
> diff --git a/arch/x86/include/asm/uprobes.h b/arch/x86/include/asm/uprobes.h
> index cca62c5..9528117 100644
> --- a/arch/x86/include/asm/uprobes.h
> +++ b/arch/x86/include/asm/uprobes.h
> @@ -51,6 +51,7 @@ struct arch_uprobe {
>  		struct {
>  			s32	disp;
>  			u8	ilen;
> +			u8	opc1;
>  		}				ttt;
>  	};
>  };
> diff --git a/arch/x86/kernel/uprobes.c b/arch/x86/kernel/uprobes.c
> index 3bcc121..9283024 100644
> --- a/arch/x86/kernel/uprobes.c
> +++ b/arch/x86/kernel/uprobes.c
> @@ -461,33 +461,85 @@ static struct uprobe_xol_ops default_xol_ops = {
>  	.post_xol = default_post_xol_op,
>  };
> 
> +static bool ttt_is_call(struct arch_uprobe *auprobe)
> +{
> +	return auprobe->ttt.opc1 == 0xe8;
> +}
> +
>  static bool ttt_emulate_op(struct arch_uprobe *auprobe, struct pt_regs *regs)
>  {
> -	regs->ip += auprobe->ttt.ilen + auprobe->ttt.disp;
> +	unsigned long new_ip = regs->ip += auprobe->ttt.ilen;
> +
> +	if (ttt_is_call(auprobe)) {
> +		unsigned long new_sp = regs->sp - sizeof_long();
> +		if (copy_to_user((void __user *)new_sp, &new_ip, sizeof_long()))
> +			return false;
> +		regs->sp = new_sp;
> +	}
> +
> +	regs->ip = new_ip + auprobe->ttt.disp;
>  	return true;
>  }
> 
> +static int ttt_post_xol_op(struct arch_uprobe *auprobe, struct pt_regs *regs)
> +{
> +	BUG_ON(!ttt_is_call(auprobe));
> +	/*
> +	 * We can only get here if ttt_emulate_op() failed to push the return
> +	 * address _and_ another thread expanded our stack before the (mangled)
> +	 * "call" insn was executed out-of-line. Just restore ->sp and restart.
> +	 * We could also restore ->ip and try to call ttt_emulate_op() again.
> +	 */

As I understand it, this comment assumes that the single-stepped call
instruction can't grow the stack on its own, and will instead die with a
SEGV or some such.  As noted above, I don't know if that's correct.  (If
the single-stepped call doesn't die and doesn't grow the stack -- I'm
not sure how that would happen -- then it seems like we have an infinite
loop.)

> +	regs->sp += sizeof_long();
> +	return -ERESTART;
> +}
> +
> +static void ttt_clear_displacement(struct arch_uprobe *auprobe, struct insn *insn)
> +{
> +	/*
> +	 * Turn this insn into "call 1f; 1:", this is what we will execute
> +	 * out-of-line if ->emulate() fails.

Add comment: In the unlikely event that this mangled instruction is
successfully single-stepped, we'll reset regs->ip to the beginning of
the instruction so the probepoint is hit again and the original
instruction is emulated (successfully this time, we assume).

> +	 *
> +	 * In the likely case this will lead to arch_uprobe_abort_xol(), but
> +	 * see the comment in ->emulate(). So we need to ensure that the new
> +	 * ->ip can't fall into non-canonical area and trigger #GP.
> +	 *
> +	 * We could turn it into (say) "pushf", but then we would need to
> +	 * divorce ->insn[] and ->ixol[]. We need to preserve the 1st byte
> +	 * of ->insn[] for set_orig_insn().
> +	 */
> +	memset(auprobe->insn + insn_offset_displacement(insn),
> +		0, insn->moffset1.nbytes);

s/displacement/immediate/
s/moffset1/immediate/
Both already fixed in today's patch.

> +}
> +
>  static struct uprobe_xol_ops ttt_xol_ops = {
>  	.emulate  = ttt_emulate_op,
> +	.post_xol = ttt_post_xol_op,
>  };
> 
>  static int ttt_setup_xol_ops(struct arch_uprobe *auprobe, struct insn *insn)
>  {
> +	u8 opc1 = OPCODE1(insn);
> +
> +	/* has the side-effect of processing the entire instruction */
> +	insn_get_length(insn);
> +	if (WARN_ON_ONCE(!insn_complete(insn)))
> +		return -ENOEXEC;
> 
> -	switch (OPCODE1(insn)) {
> +	switch (opc1) {
>  	case 0xeb:	/* jmp 8 */
>  	case 0xe9:	/* jmp 32 */
>  	case 0x90:	/* prefix* + nop; same as jmp with .disp = 0 */
>  		break;
> +
> +	case 0xe8:	/* call relative */
> +		ttt_clear_displacement(auprobe, insn);
> +		auprobe->ttt.opc1 = opc1;

You set ttt.opc1 for call, and later for conditional jumps, but not for
nop and unconditional jumps.  Might confuse a maintainer down the road?

> +		break;
>  	default:
>  		return -ENOSYS;
>  	}
> 
> -	/* has the side-effect of processing the entire instruction */
> -	insn_get_length(insn);
> -	if (WARN_ON_ONCE(!insn_complete(insn)))
> -		return -ENOEXEC;
> -
>  	auprobe->ttt.ilen = insn->length;
>  	auprobe->ttt.disp = insn->moffset1.value;
>  	/* so far we assume that it fits into ->moffset1 */
> @@ -534,9 +586,6 @@ int arch_uprobe_analyze_insn(struct arch_uprobe *auprobe, struct mm_struct *mm,
>  	case 0xca:
>  		fix_ip = false;
>  		break;
> -	case 0xe8:		/* call relative - Fix return addr */
> -		fix_call = true;
> -		break;
>  	case 0x9a:		/* call absolute - Fix return addr, not ip */
>  		fix_call = true;
>  		fix_ip = false;


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/