linux-kernel - Re: [PATCHv2] x86 trace: Fix page fault tracing bug

Message-ID: <20140228111537.16e1635e@gandalf.local.home>
Date:	Fri, 28 Feb 2014 11:15:37 -0500
From:	Steven Rostedt <rostedt@...dmis.org>
To:	Jiri Olsa <jolsa@...hat.com>
Cc:	Peter Zijlstra <peterz@...radead.org>,
	linux-kernel@...r.kernel.org, Paul Mackerras <paulus@...ba.org>,
	Ingo Molnar <mingo@...hat.com>,
	Arnaldo Carvalho de Melo <acme@...stprotocols.net>,
	"H. Peter Anvin" <hpa@...or.com>,
	Seiji Aguchi <seiji.aguchi@....com>,
	Vince Weaver <vincent.weaver@...ne.edu>
Subject: Re: [PATCHv2] x86 trace: Fix page fault tracing bug

Vince, you probably missed my other emails, since I sent them from mutt
rather than my normal email client. It turns out mutt sends through my
own box, and I got this error message (I need to go back to sending via
my ISP instead of my local mail server):

>>> vincent.weaver@...ne.edu (after MAIL FROM): 550 5.5.4 <rostedt@...e.goodmis.org>... Domain of sender address rostedt@...e.goodmis.org refused. MX points to hosts without valid addresses. This violates RFC 1035/2181.  

-- Steve


On Fri, 28 Feb 2014 17:05:26 +0100
Jiri Olsa <jolsa@...hat.com> wrote:

> On Fri, Feb 28, 2014 at 04:47:08PM +0100, Peter Zijlstra wrote:
> > On Fri, Feb 28, 2014 at 04:33:40PM +0100, Jiri Olsa wrote:
> > 
> > While I like the idea of just pushing up the CR2 read, the below still
> > does the read too late: exception_enter() also has a tracepoint in it.
> 
> please check v2, thanks
> 
> jirka
> 
> 
> ---
> The trace_do_page_fault function triggers a tracepoint
> and then handles the actual page fault.
> 
> This can lead to an error if the tracepoint itself causes
> a page fault: the original cr2 value gets lost and the
> page fault handler kills the current process with SIGSEGV.
> 
> This happens if you record page faults with callchain
> data, because reading the user part of the callchain makes
> the tracepoint handler page fault:
> 
>   # perf record -g -e exceptions:page_fault_user ls
> 
> Fix this by saving the original cr2 value
> and using it after the tracepoint handler is done.
> 
> v2: Move the cr2 read before exception_enter, because
>     it could trigger a tracepoint as well.
> 
> Reported-by: Arnaldo Carvalho de Melo <acme@...stprotocols.net>
> Cc: Peter Zijlstra <a.p.zijlstra@...llo.nl>
> Cc: Paul Mackerras <paulus@...ba.org>
> Cc: Ingo Molnar <mingo@...hat.com>
> Cc: Arnaldo Carvalho de Melo <acme@...stprotocols.net>
> Cc: H. Peter Anvin <hpa@...or.com>
> Cc: Seiji Aguchi <seiji.aguchi@....com>
> Cc: Vince Weaver <vincent.weaver@...ne.edu>
> Cc: Steven Rostedt <rostedt@...dmis.org>
> ---
>  arch/x86/mm/fault.c | 20 +++++++++++++-------
>  1 file changed, 13 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
> index 9d591c8..dd59031 100644
> --- a/arch/x86/mm/fault.c
> +++ b/arch/x86/mm/fault.c
> @@ -1016,11 +1016,11 @@ static inline bool smap_violation(int error_code, struct pt_regs *regs)
>   * routines.
>   */
>  static void __kprobes
> -__do_page_fault(struct pt_regs *regs, unsigned long error_code)
> +__do_page_fault(struct pt_regs *regs, unsigned long error_code,
> +		unsigned long address)
>  {
>  	struct vm_area_struct *vma;
>  	struct task_struct *tsk;
> -	unsigned long address;
>  	struct mm_struct *mm;
>  	int fault;
>  	unsigned int flags = FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_KILLABLE;
> @@ -1028,9 +1028,6 @@ __do_page_fault(struct pt_regs *regs, unsigned long error_code)
>  	tsk = current;
>  	mm = tsk->mm;
>  
> -	/* Get the faulting address: */
> -	address = read_cr2();
> -
>  	/*
>  	 * Detect and handle instructions that would cause a page fault for
>  	 * both a tracked kernel page and a userspace page.
> @@ -1248,9 +1245,11 @@ dotraplinkage void __kprobes
>  do_page_fault(struct pt_regs *regs, unsigned long error_code)
>  {
>  	enum ctx_state prev_state;
> +	/* Get the faulting address: */
> +	unsigned long address = read_cr2();
>  
>  	prev_state = exception_enter();
> -	__do_page_fault(regs, error_code);
> +	__do_page_fault(regs, error_code, address);
>  	exception_exit(prev_state);
>  }
>  
> @@ -1267,9 +1266,16 @@ dotraplinkage void __kprobes
>  trace_do_page_fault(struct pt_regs *regs, unsigned long error_code)
>  {
>  	enum ctx_state prev_state;
> +	/*
> +	 * The exception_enter and tracepoint processing could
> +	 * trigger another page fault (user space callchain
> +	 * reading) and destroy the original cr2 value, so read
> +	 * the faulting address now.
> +	 */
> +	unsigned long address = read_cr2();
>  
>  	prev_state = exception_enter();
>  	trace_page_fault_entries(regs, error_code);
> -	__do_page_fault(regs, error_code);
> +	__do_page_fault(regs, error_code, address);
>  	exception_exit(prev_state);
>  }
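
As a rough illustration of the ordering problem the quoted patch
describes, here is a minimal user-space sketch in C. It is not kernel
code: fake_cr2, raise_fault, fake_tracepoint, NESTED_FAULT_ADDR and the
two handle_fault_* functions are hypothetical stand-ins invented for this
example. It only assumes what the patch description states, namely that a
nested fault taken inside the tracepoint handler overwrites cr2.

```c
/* Stand-in for the CR2 register: address of the most recent page fault. */
unsigned long fake_cr2;

/* Hypothetical address touched by the tracepoint's user callchain walk. */
#define NESTED_FAULT_ADDR 0xdeadbeefUL

/* Simulate the CPU taking a page fault: it records the address in CR2. */
void raise_fault(unsigned long addr)
{
	fake_cr2 = addr;
}

unsigned long read_fake_cr2(void)
{
	return fake_cr2;
}

/*
 * Simulated tracepoint handler. Reading the user callchain takes a
 * nested fault, which overwrites the CR2 stand-in.
 */
void fake_tracepoint(void)
{
	raise_fault(NESTED_FAULT_ADDR);
}

/* Pre-patch ordering: tracepoint first, CR2 read afterwards. */
unsigned long handle_fault_late_read(void)
{
	fake_tracepoint();
	return read_fake_cr2();	/* sees the nested fault address */
}

/* Patched ordering: save CR2 before anything that can fault again. */
unsigned long handle_fault_early_read(void)
{
	unsigned long address = read_fake_cr2();

	fake_tracepoint();
	return address;		/* still the original fault address */
}
```

After raise_fault(0x1000), handle_fault_late_read() returns
NESTED_FAULT_ADDR, while handle_fault_early_read() returns 0x1000. This
mirrors why the patch reads cr2 before exception_enter() and the
tracepoint and then passes the saved address into __do_page_fault().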
