lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <6aefb972-3691-2b66-a189-97815df10a12@linux.alibaba.com>
Date:   Tue, 14 Sep 2021 10:02:07 +0800
From:   ηŽ‹θ΄‡ <yun.wang@...ux.alibaba.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     Ingo Molnar <mingo@...hat.com>,
        Arnaldo Carvalho de Melo <acme@...nel.org>,
        Mark Rutland <mark.rutland@....com>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        Jiri Olsa <jolsa@...hat.com>,
        Namhyung Kim <namhyung@...nel.org>,
        Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        Andrii Nakryiko <andrii@...nel.org>,
        Martin KaFai Lau <kafai@...com>,
        Song Liu <songliubraving@...com>, Yonghong Song <yhs@...com>,
        John Fastabend <john.fastabend@...il.com>,
        KP Singh <kpsingh@...nel.org>,
        "open list:PERFORMANCE EVENTS SUBSYSTEM" 
        <linux-perf-users@...r.kernel.org>,
        "open list:PERFORMANCE EVENTS SUBSYSTEM" 
        <linux-kernel@...r.kernel.org>,
        "open list:BPF (Safe dynamic programs and tools)" 
        <netdev@...r.kernel.org>,
        "open list:BPF (Safe dynamic programs and tools)" 
        <bpf@...r.kernel.org>
Subject: Re: [RFC PATCH] perf: fix panic by mark recursion inside
 perf_log_throttle



On 2021/9/13 δΈ‹εˆ6:36, Peter Zijlstra wrote:
> On Mon, Sep 13, 2021 at 12:24:24PM +0200, Peter Zijlstra wrote:
> 
> FWIW:
> 
>> I'm confused tho; where does the #DF come from? Because taking a #PF
>> from NMI should be perfectly fine.
>>
>> AFAICT that callchain is something like:
>>
>> 	NMI
>> 	  perf_event_nmi_handler()
>> 	    (part of the chain is missing here)
>> 	      perf_log_throttle()
>> 	        perf_output_begin() /* events/ring_buffer.c */
>> 		  rcu_read_lock()
>> 		    rcu_lock_acquire()
>> 		      lock_acquire()
>> 		        trace_lock_acquire() --> perf_trace_foo
> 
> This function also calls perf_trace_buf_alloc(), and will have
> incremented the recursion count, such that:
> 
>>
>> 			  ...
>> 			    perf_callchain()
>> 			      perf_callchain_user()
>> 			        #PF (fully expected during a userspace callchain)
>> 				  (some stuff, until the first __fentry)
>> 				    perf_trace_function_call
>> 				      perf_trace_buf_alloc()
>> 				        perf_swevent_get_recursion_context()
>> 					  *BOOM*
> 
> this one, if it wouldn't mysteriously explode, would find recursion and
> terminate, except that seems to be going side-ways.

Yes, it supposed to avoid recursion in the same context, but it never got
chance to do that, the function and struct should all be fine, any idea
in such situation what can trigger this kind of double fault?

Regards,
Michael Wang

> 
>> Now, supposedly we then take another #PF from get_recursion_context() or
>> something, but that doesn't make sense. That should just work...
>>
>> Can you figure out what's going wrong there? going with the RIP, this
>> almost looks like 'swhash->recursion' goes splat, but again that makes
>> no sense, that's a per-cpu variable.
>>
>>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ