Date:	Sat, 14 Mar 2009 12:26:01 -0400
From:	Mathieu Desnoyers <compudj@...stal.dyndns.org>
To:	ltt-dev@...ts.casi.polymtl.ca, linux-kernel@...r.kernel.org
Cc:	mbligh@...gle.com
Subject: Re: [ltt-dev] LTTng 0.108 provides many performance improvements

* Mathieu Desnoyers (compudj@...stal.dyndns.org) wrote:
> Hi,
> 
> I just released LTTng 0.108. The time had come to do a bit of
> performance tuning using oprofile.
> 
> Basically, the tbench workload, under flight recorder tracing, went
> from a 52% slowdown with the previous LTTng to a 32% slowdown with LTTng
> 0.108 on my test machine (8-core x86_64, 16 GB RAM).
> 

Down to a 30% performance impact by using a pointer array instead of a
linked list to manage the buffer pages (it's in 0.110). I make sure that
accesses to the array never cause vmalloc faults. (I think the kernel
could potentially fall back to vmalloc'd memory if the array is too
large to be allocated with kmalloc, but I'm unsure about this, as I
cannot find the offending code in 2.6.29-rc7.) I call
vmalloc_sync_all() after allocating the array to make sure the page
tables are populated (it's just safer).
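
To illustrate why the pointer array helps, here is a minimal user-space
sketch, not the LTTng code. All names (page_node, page_array, list_addr,
array_addr, page_array_init) and the 4096-byte page size are assumptions
made for the example. With a linked list, finding the page that holds a
given buffer offset means walking the chain; with an array it is a
single index.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define PAGE_SZ 4096u  /* assumed page size for this sketch */

/* Linked-list layout: reaching page n means following n links. */
struct page_node {
    struct page_node *next;
    unsigned char *data;
};

/* Array layout: one pointer per page, indexed directly. */
struct page_array {
    unsigned char **pages;
    size_t nr_pages;
};

/* O(n) lookup: walk the chain to the page holding `offset`.
 * The caller must ensure the list is long enough. */
static unsigned char *list_addr(struct page_node *head, size_t offset)
{
    size_t idx = offset / PAGE_SZ;

    while (idx-- > 0)
        head = head->next;
    return head->data + (offset % PAGE_SZ);
}

/* O(1) lookup: index the array, then add the in-page offset. */
static unsigned char *array_addr(const struct page_array *buf, size_t offset)
{
    size_t idx = offset / PAGE_SZ;

    assert(idx < buf->nr_pages);
    return buf->pages[idx] + (offset % PAGE_SZ);
}

/* Allocate nr_pages zeroed pages plus the array that indexes them.
 * Returns 0 on success, -1 on allocation failure. */
static int page_array_init(struct page_array *buf, size_t nr_pages)
{
    buf->pages = calloc(nr_pages, sizeof(*buf->pages));
    if (!buf->pages)
        return -1;
    for (size_t i = 0; i < nr_pages; i++) {
        buf->pages[i] = calloc(1, PAGE_SZ);
        if (!buf->pages[i]) {
            while (i-- > 0)
                free(buf->pages[i]);
            free(buf->pages);
            return -1;
        }
    }
    buf->nr_pages = nr_pages;
    return 0;
}
```

In the kernel the array itself would come from the kernel allocators
rather than calloc(); the sketch only shows the difference in lookup
cost on the write path.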

Mathieu

> Modifications done:
> 
> - inlined fast paths. Modularity is now provided by the build system,
>   not by callbacks anymore. Selecting between lockless and locked buffer
>   management must be done at compile-time. I'd like to keep the
>   "transport" around because it will be used eventually to specify where
>   the information must be sent rather than selecting the buffer management
>   mechanism (e.g. sent to physical pages (contiguous or non-contiguous),
>   video card memory...). The "transport" option is still there, but it
>   currently does not do much. The slow paths are now done in function
>   calls.
> 
> - Fixed a false sharing problem. It looks like the kzalloc_node()
>   allocator, used to allocate the commit counters, does not align the
>   allocated memory on cache lines.
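
The false sharing fix mentioned in the list above can be sketched in
user-space C11 as follows. This is an illustration under assumptions,
not the LTTng change itself: the 64-byte cache line, the commit_counter
struct and alloc_counters() are made up for the example, and
aligned_alloc() stands in for the kernel allocator. The idea is that
each counter occupies its own cache line, so writers on different CPUs
do not invalidate each other's lines.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define CACHE_LINE 64  /* assumed cache-line size */

/* One counter per slot. The alignment requirement also pads
 * sizeof(struct commit_counter) up to a full cache line. */
struct commit_counter {
    _Alignas(CACHE_LINE) unsigned long count;
};

/* Allocate n zeroed counters starting on a cache-line boundary.
 * The size passed to aligned_alloc() is a multiple of CACHE_LINE
 * because the struct size is. Returns NULL on failure. */
static struct commit_counter *alloc_counters(size_t n)
{
    struct commit_counter *c = aligned_alloc(CACHE_LINE, n * sizeof(*c));

    if (c)
        memset(c, 0, n * sizeof(*c));
    return c;
}
```

Padding every counter trades memory for isolation: eight counters take
512 bytes here instead of 64 on a typical 64-bit system.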
> 
> Therefore I think the new code will be _much_ easier to optimize,
> because the fast paths are very well identified and much smaller than
> they were before. I reduced the tracer's stack space, register usage
> and instruction cache usage.
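
A rough sketch of what "inlined fast paths, selected at compile time"
can look like, in user-space C11. Everything here is hypothetical:
struct ring, lockless_reserve(), locked_reserve(), reserve_slow(), the
buffer_reserve macro and the SKETCH_LOCKED build flag are names invented
for the example, and the slow path simply fails where a real tracer
would do more work. The point is that the common case is a small inline
function chosen by the build, with no indirect call, while the rare
case stays in an out-of-line function.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct ring {
    _Atomic size_t write_offset;  /* next free byte, never above size */
    size_t size;                  /* buffer capacity in bytes */
    atomic_flag lock;             /* used only by the locked variant */
};

/* Slow path, kept out of line. This sketch only reports that the
 * reservation does not fit. */
static int reserve_slow(struct ring *r, size_t len)
{
    (void)r;
    (void)len;
    return -1;
}

/* Lockless fast path: reserve len bytes with a compare-and-swap loop.
 * On CAS failure `old` is reloaded and the space check is redone. */
static inline int lockless_reserve(struct ring *r, size_t len, size_t *out)
{
    size_t old = atomic_load(&r->write_offset);

    do {
        if (len > r->size - old)
            return reserve_slow(r, len);
    } while (!atomic_compare_exchange_weak(&r->write_offset, &old, old + len));
    *out = old;
    return 0;
}

/* Locked fast path: same reservation under a simple spinlock. */
static inline int locked_reserve(struct ring *r, size_t len, size_t *out)
{
    int ret = 0;
    size_t old;

    while (atomic_flag_test_and_set(&r->lock))
        ;  /* spin until the lock is free */
    old = atomic_load(&r->write_offset);
    if (len > r->size - old) {
        ret = reserve_slow(r, len);
    } else {
        atomic_store(&r->write_offset, old + len);
        *out = old;
    }
    atomic_flag_clear(&r->lock);
    return ret;
}

/* Compile-time selection: the build picks one variant, so callers
 * get a direct, inlinable call instead of a function pointer. */
#ifdef SKETCH_LOCKED
#define buffer_reserve locked_reserve
#else
#define buffer_reserve lockless_reserve
#endif
```

A caller writes buffer_reserve(&r, len, &offset) and never sees which
variant was built in.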
> 
> Mathieu
> 
> -- 
> Mathieu Desnoyers
> OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
> 
> _______________________________________________
> ltt-dev mailing list
> ltt-dev@...ts.casi.polymtl.ca
> http://lists.casi.polymtl.ca/cgi-bin/mailman/listinfo/ltt-dev
> 

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68