lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <650783164.16607.1593486812734.JavaMail.zimbra@efficios.com>
Date:   Mon, 29 Jun 2020 23:13:32 -0400 (EDT)
From:   Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To:     rostedt <rostedt@...dmis.org>
Cc:     linux-kernel <linux-kernel@...r.kernel.org>,
        Ingo Molnar <mingo@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Masami Hiramatsu <mhiramat@...nel.org>,
        Arnaldo Carvalho de Melo <acme@...hat.com>,
        Jiri Olsa <jolsa@...hat.com>,
        Namhyung Kim <namhyung@...nel.org>,
        Yordan Karadzhov <y.karadz@...il.com>,
        Tzvetomir Stoyanov <tz.stoyanov@...il.com>,
        Tom Zanussi <zanussi@...nel.org>,
        Jason Behmer <jbehmer@...gle.com>,
        Julia Lawall <julia.lawall@...ia.fr>,
        Clark Williams <williams@...hat.com>,
        bristot <bristot@...hat.com>, Daniel Wagner <wagi@...om.org>,
        Darren Hart <dvhart@...are.com>,
        Jonathan Corbet <corbet@....net>,
        "Suresh E. Warrier" <warrier@...ux.vnet.ibm.com>,
        "Joel Fernandes, Google" <joel@...lfernandes.org>
Subject: Re: [RFC][PATCH] ring-buffer: Have nested events still record
 running time stamp

----- On Jun 29, 2020, at 8:21 PM, rostedt rostedt@...dmis.org wrote:

> On Thu, 25 Jun 2020 15:35:02 -0400 (EDT)
> Mathieu Desnoyers <mathieu.desnoyers@...icios.com> wrote:
> 
>> >> So the reservation is not "just" an add instruction, it's actually an
>> >> xadd on x86. Is that really faster than a cmpxchg ?
>> > 
>> > I believe the answer is still yes. But I can run some benchmarks to
>> > make sure.
>> 
>> This would be interesting to see, because if xadd and cmpxchg have
>> similar overhead, then going for a cmpxchg-loop for the space
>> reservation could vastly decrease the overall complexity of this
>> timestamp+space reservation algorithm.
> 
> So I tested this on three different machines to see how it compared. I
> applied this patch:
> 
> diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
> index c2c0d25ea004..8a7d62375eb3 100644
> --- a/kernel/trace/ring_buffer.c
> +++ b/kernel/trace/ring_buffer.c
> @@ -3161,7 +3161,19 @@ __rb_reserve_next(struct ring_buffer_per_cpu *cpu_buffer,
> 
>  /*B*/	rb_time_set(&cpu_buffer->before_stamp, info->ts);
> 
> +#if 0
>  /*C*/	write = local_add_return(info->length, &tail_page->write);
> +#else
> +	{
> +		unsigned long expect, result;
> +
> +		do {
> +			expect = local_read(&tail_page->write);
> +			write = expect + info->length;
> +			result = local_cmpxchg(&tail_page->write, expect, write);
> +		} while (expect != result);
> +	}
> +#endif
> 
> 	/* set write to only the index of the write */
> 	write &= RB_WRITE_MASK;
> 
> 
> And tested before and after the results.

[...]
> 
> Not that big of a difference, but there is a difference. And the
> standard deviation appears to be better, which would make sense as an
> local_add_return() needs no repeat on collision.

I tried running your recommended test on my machine many times with the same
configuration. It appears that the avg and std dev results vary a lot between
runs of the exact same configuration:

Model name:          Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz

* With xadd (existing code):

 event_benchmark-1828  [016]    71.309475: benchmark_event:      last=190 first=4200 max=234226 min=185 avg=256 std=333 std^2=111437
 event_benchmark-1828  [016]    71.309476: benchmark_event:      last=192 first=4200 max=234226 min=185 avg=256 std=333 std^2=111437
 event_benchmark-1828  [016]    71.309476: benchmark_event:      last=195 first=4200 max=234226 min=185 avg=256 std=333 std^2=111437
 event_benchmark-1828  [016]    71.309477: benchmark_event:      last=193 first=4200 max=234226 min=185 avg=256 std=333 std^2=111437
 event_benchmark-1828  [016]    71.309478: benchmark_event:      last=190 first=4200 max=234226 min=185 avg=256 std=333 std^2=111437
 event_benchmark-1828  [016]    71.309479: benchmark_event:      last=190 first=4200 max=234226 min=185 avg=256 std=333 std^2=111437
 event_benchmark-1828  [016]    71.309479: benchmark_event:      last=192 first=4200 max=234226 min=185 avg=256 std=333 std^2=111437
 event_benchmark-1828  [016]    71.309480: benchmark_event:      last=192 first=4200 max=234226 min=185 avg=256 std=333 std^2=111437
 event_benchmark-1828  [016]    71.309481: benchmark_event:      last=195 first=4200 max=234226 min=185 avg=256 std=333 std^2=111437
 event_benchmark-1828  [016]    71.309481: benchmark_event:      last=193 first=4200 max=234226 min=185 avg=256 std=333 std^2=111437


 event_benchmark-1865  [003]   162.955546: benchmark_event:      last=489 first=4386 max=29000 min=185 avg=243 std=161 std^2=26057
 event_benchmark-1865  [003]   162.955546: benchmark_event:      last=205 first=4386 max=29000 min=185 avg=243 std=161 std^2=26057
 event_benchmark-1865  [003]   162.955547: benchmark_event:      last=204 first=4386 max=29000 min=185 avg=243 std=161 std^2=26057
 event_benchmark-1865  [003]   162.955548: benchmark_event:      last=204 first=4386 max=29000 min=185 avg=243 std=161 std^2=26057
 event_benchmark-1865  [003]   162.955549: benchmark_event:      last=203 first=4386 max=29000 min=185 avg=243 std=161 std^2=26057
 event_benchmark-1865  [003]   162.955549: benchmark_event:      last=206 first=4386 max=29000 min=185 avg=243 std=161 std^2=26057
 event_benchmark-1865  [003]   162.955550: benchmark_event:      last=203 first=4386 max=29000 min=185 avg=243 std=161 std^2=26057
 event_benchmark-1865  [003]   162.955551: benchmark_event:      last=205 first=4386 max=29000 min=185 avg=243 std=161 std^2=26057
 event_benchmark-1865  [003]   162.955552: benchmark_event:      last=205 first=4386 max=29000 min=185 avg=243 std=161 std^2=26057
 event_benchmark-1865  [003]   162.955552: benchmark_event:      last=205 first=4386 max=29000 min=185 avg=243 std=161 std^2=26057

 event_benchmark-1904  [003]   202.847955: benchmark_event:      last=205 first=5230 max=27647 min=185 avg=239 std=150 std^2=22605
 event_benchmark-1904  [003]   202.847956: benchmark_event:      last=500 first=5230 max=27647 min=185 avg=239 std=150 std^2=22605
 event_benchmark-1904  [003]   202.847957: benchmark_event:      last=355 first=5230 max=27647 min=185 avg=239 std=150 std^2=22605
 event_benchmark-1904  [003]   202.847958: benchmark_event:      last=340 first=5230 max=27647 min=185 avg=239 std=150 std^2=22605
 event_benchmark-1904  [003]   202.847959: benchmark_event:      last=462 first=5230 max=27647 min=185 avg=239 std=150 std^2=22605
 event_benchmark-1904  [003]   202.847960: benchmark_event:      last=278 first=5230 max=27647 min=185 avg=239 std=150 std^2=22605
 event_benchmark-1904  [003]   202.847961: benchmark_event:      last=243 first=5230 max=27647 min=185 avg=239 std=150 std^2=22605
 event_benchmark-1904  [003]   202.847962: benchmark_event:      last=205 first=5230 max=27647 min=185 avg=239 std=150 std^2=22605
 event_benchmark-1904  [003]   202.847963: benchmark_event:      last=470 first=5230 max=27647 min=185 avg=239 std=150 std^2=22605
 event_benchmark-1904  [003]   202.847964: benchmark_event:      last=364 first=5230 max=27647 min=185 avg=239 std=150 std^2=22605

 event_benchmark-1941  [003]   234.577797: benchmark_event:      last=204 first=5117 max=17387 min=185 avg=233 std=141 std^2=19989
 event_benchmark-1941  [003]   234.577798: benchmark_event:      last=445 first=5117 max=17387 min=185 avg=233 std=141 std^2=19989
 event_benchmark-1941  [003]   234.577799: benchmark_event:      last=693 first=5117 max=17387 min=185 avg=233 std=141 std^2=19989
 event_benchmark-1941  [003]   234.577800: benchmark_event:      last=293 first=5117 max=17387 min=185 avg=233 std=141 std^2=19989
 event_benchmark-1941  [003]   234.577801: benchmark_event:      last=205 first=5117 max=17387 min=185 avg=233 std=141 std^2=19989
 event_benchmark-1941  [003]   234.577802: benchmark_event:      last=202 first=5117 max=17387 min=185 avg=233 std=141 std^2=19989
 event_benchmark-1941  [003]   234.577802: benchmark_event:      last=203 first=5117 max=17387 min=185 avg=233 std=141 std^2=19989
 event_benchmark-1941  [003]   234.577803: benchmark_event:      last=204 first=5117 max=17387 min=185 avg=233 std=141 std^2=19989
 event_benchmark-1941  [003]   234.577804: benchmark_event:      last=204 first=5117 max=17387 min=185 avg=233 std=141 std^2=19989
 event_benchmark-1941  [003]   234.577805: benchmark_event:      last=205 first=5117 max=17387 min=185 avg=233 std=141 std^2=19989


 event_benchmark-1978  [003]   279.276351: benchmark_event:      last=205 first=4583 max=17638 min=185 avg=241 std=163 std^2=26885
 event_benchmark-1978  [003]   279.276352: benchmark_event:      last=203 first=4583 max=17638 min=185 avg=241 std=163 std^2=26885
 event_benchmark-1978  [003]   279.276353: benchmark_event:      last=207 first=4583 max=17638 min=185 avg=241 std=163 std^2=26885
 event_benchmark-1978  [003]   279.276354: benchmark_event:      last=207 first=4583 max=17638 min=185 avg=241 std=163 std^2=26885
 event_benchmark-1978  [003]   279.276355: benchmark_event:      last=205 first=4583 max=17638 min=185 avg=241 std=163 std^2=26885
 event_benchmark-1978  [003]   279.276355: benchmark_event:      last=205 first=4583 max=17638 min=185 avg=241 std=163 std^2=26885
 event_benchmark-1978  [003]   279.276356: benchmark_event:      last=204 first=4583 max=17638 min=185 avg=241 std=163 std^2=26885
 event_benchmark-1978  [003]   279.276357: benchmark_event:      last=206 first=4583 max=17638 min=185 avg=241 std=163 std^2=26885
 event_benchmark-1978  [003]   279.276358: benchmark_event:      last=286 first=4583 max=17638 min=185 avg=241 std=163 std^2=26885
 event_benchmark-1978  [003]   279.276359: benchmark_event:      last=202 first=4583 max=17638 min=185 avg=241 std=163 std^2=26885


 event_benchmark-2015  [003]   309.974432: benchmark_event:      last=202 first=5090 max=22623 min=185 avg=241 std=152 std^2=23323
 event_benchmark-2015  [003]   309.974433: benchmark_event:      last=204 first=5090 max=22623 min=185 avg=241 std=152 std^2=23323
 event_benchmark-2015  [003]   309.974434: benchmark_event:      last=205 first=5090 max=22623 min=185 avg=241 std=152 std^2=23323
 event_benchmark-2015  [003]   309.974435: benchmark_event:      last=206 first=5090 max=22623 min=185 avg=241 std=152 std^2=23323
 event_benchmark-2015  [003]   309.974435: benchmark_event:      last=205 first=5090 max=22623 min=185 avg=241 std=152 std^2=23323
 event_benchmark-2015  [003]   309.974436: benchmark_event:      last=203 first=5090 max=22623 min=185 avg=241 std=152 std^2=23323
 event_benchmark-2015  [003]   309.974437: benchmark_event:      last=205 first=5090 max=22623 min=185 avg=241 std=152 std^2=23323
 event_benchmark-2015  [003]   309.974438: benchmark_event:      last=275 first=5090 max=22623 min=185 avg=241 std=152 std^2=23323
 event_benchmark-2015  [003]   309.974439: benchmark_event:      last=205 first=5090 max=22623 min=185 avg=241 std=152 std^2=23323
 event_benchmark-2015  [003]   309.974440: benchmark_event:      last=600 first=5090 max=22623 min=185 avg=241 std=152 std^2=23323

-> This configuration's avg varies between 233 and 256 ns based on your benchmark methodology. The std. dev. varies between
141 and 333 ns.


* With your cmpxchg modification:

 event_benchmark-1256  [010]    56.943734: benchmark_event:      last=210 first=5210 max=83243 min=187 avg=265 std=201 std^2=40524
 event_benchmark-1256  [010]    56.943736: benchmark_event:      last=208 first=5210 max=83243 min=187 avg=265 std=201 std^2=40524
 event_benchmark-1256  [010]    56.943737: benchmark_event:      last=1223 first=5210 max=83243 min=187 avg=265 std=201 std^2=40524
 event_benchmark-1256  [010]    56.943738: benchmark_event:      last=676 first=5210 max=83243 min=187 avg=265 std=201 std^2=40524
 event_benchmark-1256  [010]    56.943739: benchmark_event:      last=337 first=5210 max=83243 min=187 avg=265 std=201 std^2=40524
 event_benchmark-1256  [010]    56.943740: benchmark_event:      last=455 first=5210 max=83243 min=187 avg=265 std=201 std^2=40524
 event_benchmark-1256  [010]    56.943741: benchmark_event:      last=280 first=5210 max=83243 min=187 avg=265 std=201 std^2=40524
 event_benchmark-1256  [010]    56.943742: benchmark_event:      last=246 first=5210 max=83243 min=187 avg=265 std=201 std^2=40524
 event_benchmark-1256  [010]    56.943743: benchmark_event:      last=205 first=5210 max=83243 min=187 avg=265 std=201 std^2=40524
 event_benchmark-1256  [010]    56.943743: benchmark_event:      last=207 first=5210 max=83243 min=187 avg=265 std=201 std^2=40524

 event_benchmark-1293  [025]    84.352468: benchmark_event:      last=198 first=5446 max=25969 min=188 avg=241 std=168 std^2=28469
 event_benchmark-1293  [025]    84.352469: benchmark_event:      last=198 first=5446 max=25969 min=188 avg=241 std=168 std^2=28469
 event_benchmark-1293  [025]    84.352470: benchmark_event:      last=338 first=5446 max=25969 min=188 avg=241 std=168 std^2=28469
 event_benchmark-1293  [025]    84.352471: benchmark_event:      last=197 first=5446 max=25969 min=188 avg=241 std=168 std^2=28469
 event_benchmark-1293  [025]    84.352471: benchmark_event:      last=196 first=5446 max=25969 min=188 avg=241 std=168 std^2=28469
 event_benchmark-1293  [025]    84.352472: benchmark_event:      last=197 first=5446 max=25969 min=188 avg=241 std=168 std^2=28469
 event_benchmark-1293  [025]    84.352473: benchmark_event:      last=198 first=5446 max=25969 min=188 avg=241 std=168 std^2=28469
 event_benchmark-1293  [025]    84.352474: benchmark_event:      last=197 first=5446 max=25969 min=188 avg=241 std=168 std^2=28469
 event_benchmark-1293  [025]    84.352474: benchmark_event:      last=197 first=5446 max=25969 min=188 avg=241 std=168 std^2=28469
 event_benchmark-1293  [025]    84.352475: benchmark_event:      last=196 first=5446 max=25969 min=188 avg=241 std=168 std^2=28469


 event_benchmark-1330  [014]   111.729937: benchmark_event:      last=473 first=3724 max=31525 min=187 avg=245 std=160 std^2=25626
 event_benchmark-1330  [014]   111.729939: benchmark_event:      last=1085 first=3724 max=31525 min=187 avg=245 std=160 std^2=25627
 event_benchmark-1330  [014]   111.729940: benchmark_event:      last=602 first=3724 max=31525 min=187 avg=245 std=160 std^2=25627
 event_benchmark-1330  [014]   111.729942: benchmark_event:      last=602 first=3724 max=31525 min=187 avg=245 std=160 std^2=25627
 event_benchmark-1330  [014]   111.729943: benchmark_event:      last=393 first=3724 max=31525 min=187 avg=245 std=160 std^2=25627
 event_benchmark-1330  [014]   111.729944: benchmark_event:      last=558 first=3724 max=31525 min=187 avg=245 std=160 std^2=25627
 event_benchmark-1330  [014]   111.729946: benchmark_event:      last=410 first=3724 max=31525 min=187 avg=245 std=160 std^2=25627
 event_benchmark-1330  [014]   111.729947: benchmark_event:      last=408 first=3724 max=31525 min=187 avg=245 std=160 std^2=25627
 event_benchmark-1330  [014]   111.729949: benchmark_event:      last=428 first=3724 max=31525 min=187 avg=245 std=160 std^2=25627
 event_benchmark-1330  [014]   111.729950: benchmark_event:      last=717 first=3724 max=31525 min=187 avg=245 std=160 std^2=25627

 event_benchmark-1367  [025]   143.251162: benchmark_event:      last=209 first=3706 max=19397 min=187 avg=244 std=146 std^2=21567
 event_benchmark-1367  [025]   143.251163: benchmark_event:      last=420 first=3706 max=19397 min=187 avg=244 std=146 std^2=21567
 event_benchmark-1367  [025]   143.251164: benchmark_event:      last=413 first=3706 max=19397 min=187 avg=244 std=146 std^2=21567
 event_benchmark-1367  [025]   143.251165: benchmark_event:      last=209 first=3706 max=19397 min=187 avg=244 std=146 std^2=21567
 event_benchmark-1367  [025]   143.251165: benchmark_event:      last=205 first=3706 max=19397 min=187 avg=244 std=146 std^2=21567
 event_benchmark-1367  [025]   143.251166: benchmark_event:      last=209 first=3706 max=19397 min=187 avg=244 std=146 std^2=21567
 event_benchmark-1367  [025]   143.251167: benchmark_event:      last=208 first=3706 max=19397 min=187 avg=244 std=146 std^2=21567
 event_benchmark-1367  [025]   143.251168: benchmark_event:      last=207 first=3706 max=19397 min=187 avg=244 std=146 std^2=21567
 event_benchmark-1367  [025]   143.251169: benchmark_event:      last=208 first=3706 max=19397 min=187 avg=244 std=146 std^2=21567
 event_benchmark-1367  [025]   143.251169: benchmark_event:      last=210 first=3706 max=19397 min=187 avg=244 std=146 std^2=21567

 event_benchmark-1404  [025]   182.750612: benchmark_event:      last=207 first=3727 max=18550 min=187 avg=239 std=138 std^2=19319
 event_benchmark-1404  [025]   182.750612: benchmark_event:      last=207 first=3727 max=18550 min=187 avg=239 std=138 std^2=19319
 event_benchmark-1404  [025]   182.750613: benchmark_event:      last=207 first=3727 max=18550 min=187 avg=239 std=138 std^2=19319
 event_benchmark-1404  [025]   182.750614: benchmark_event:      last=210 first=3727 max=18550 min=187 avg=239 std=138 std^2=19319
 event_benchmark-1404  [025]   182.750615: benchmark_event:      last=209 first=3727 max=18550 min=187 avg=239 std=138 std^2=19319
 event_benchmark-1404  [025]   182.750616: benchmark_event:      last=208 first=3727 max=18550 min=187 avg=239 std=138 std^2=19319
 event_benchmark-1404  [025]   182.750616: benchmark_event:      last=206 first=3727 max=18550 min=187 avg=239 std=138 std^2=19319
 event_benchmark-1404  [025]   182.750617: benchmark_event:      last=207 first=3727 max=18550 min=187 avg=239 std=138 std^2=19319
 event_benchmark-1404  [025]   182.750618: benchmark_event:      last=206 first=3727 max=18550 min=187 avg=239 std=138 std^2=19319
 event_benchmark-1404  [025]   182.750619: benchmark_event:      last=209 first=3727 max=18550 min=187 avg=239 std=138 std^2=19319

-> This configuration's avg varies between 239 and 256 ns based on your benchmark methodology. The std. dev. varies between
138 and 201 ns.

In the benchmark results you collected, config #1 appeared to have little delta between xadd and cmpxchg, whereas
config #2 and #3 had some avg difference. Did you try running this test many times on each machine to see if the
result was indeed stable ?

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ