lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Wed, 30 Mar 2016 10:38:53 +0800
From:	"Wangnan (F)" <wangnan0@...wei.com>
To:	Peter Zijlstra <peterz@...radead.org>
CC:	Alexei Starovoitov <ast@...nel.org>,
	Arnaldo Carvalho de Melo <acme@...hat.com>,
	<linux-kernel@...r.kernel.org>,
	Brendan Gregg <brendan.d.gregg@...il.com>,
	He Kuang <hekuang@...wei.com>, Jiri Olsa <jolsa@...nel.org>,
	Masami Hiramatsu <masami.hiramatsu.pt@...achi.com>,
	Namhyung Kim <namhyung@...nel.org>, <pi3orama@....com>,
	Zefan Li <lizefan@...wei.com>
Subject: Re: [PATCH 4/4] perf core: Add backward attribute to perf event



On 2016/3/30 10:28, Wangnan (F) wrote:
>
>
> On 2016/3/29 22:04, Peter Zijlstra wrote:
>> On Mon, Mar 28, 2016 at 06:41:32AM +0000, Wang Nan wrote:
>>
>> Could you maybe write a perf/tests thingy for this so that _some_
>> userspace exists that exercises this new code?
>>
>>
>>>   int perf_output_begin(struct perf_output_handle *handle,
>>>                 struct perf_event *event, unsigned int size)
>>>   {
>>> +    if (unlikely(is_write_backward(event)))
>>> +        return __perf_output_begin(handle, event, size, true);
>>>       return __perf_output_begin(handle, event, size, false);
>>>   }
>> Would something like:
>>
>> int perf_output_begin(...)
>> {
>>     if (unlikely(is_write_backward(event))
>>         return perf_output_begin_backward(...);
>>     return perf_output_begin_forward(...);
>> }
>>
>> make sense; I'm not sure how much is still using this, but it seems
>> somewhat excessive to inline two copies of that thing into a single
>> function.
>
> perf_output_begin is useful:
>
> $ grep perf_output_begin ./kernel -r
> ./kernel/events/ring_buffer.c:     * See perf_output_begin().
> ./kernel/events/ring_buffer.c:int perf_output_begin(struct 
> perf_output_handle *handle,
> ./kernel/events/ring_buffer.c:     * perf_output_begin() only checks 
> rb->paused, therefore
> ./kernel/events/core.c:    if (perf_output_begin(&handle, event, 
> header.size))
> ./kernel/events/core.c:    ret = perf_output_begin(&handle, event, 
> read_event.header.size);
> ./kernel/events/core.c:    ret = perf_output_begin(&handle, event,
> ./kernel/events/core.c:    ret = perf_output_begin(&handle, event,
> ./kernel/events/core.c:    ret = perf_output_begin(&handle, event,
> ./kernel/events/core.c:    ret = perf_output_begin(&handle, event, 
> rec.header.size);
> ./kernel/events/core.c:    ret = perf_output_begin(&handle, event,
> ./kernel/events/core.c:    ret = perf_output_begin(&handle, event, 
> se->event_id.header.size);
> ./kernel/events/core.c:    ret = perf_output_begin(&handle, event,
> ./kernel/events/core.c:    ret = perf_output_begin(&handle, event, 
> rec.header.size);
>
> Events like PERF_RECORD_MMAP2 uses this function, so we still need to 
> consider its overhead.
>
> So I will use your first suggestion.
>

Sorry. Your second suggestion seems also good:

My implementation makes a big perf_output_begin(), but introduces only 
one load and one branch.

Your first suggestion introduces one load, one branch and one function call.

Your second suggestion introduces one load, and at least one (and at 
most three) branches.

I need some benchmarking result.

Thank you.

Powered by blists - more mailing lists