Message-ID: <20150826231326.GE11992@lerouge>
Date:	Thu, 27 Aug 2015 01:13:27 +0200
From:	Frederic Weisbecker <fweisbec@...il.com>
To:	Hideaki Kimura <hideaki.kimura@....com>
Cc:	Jason Low <jason.low2@...com>, Oleg Nesterov <oleg@...hat.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Ingo Molnar <mingo@...nel.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	linux-kernel@...r.kernel.org,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Steven Rostedt <rostedt@...dmis.org>,
	Rik van Riel <riel@...hat.com>,
	Scott J Norton <scott.norton@...com>
Subject: Re: [PATCH 0/3] timer: Improve itimers scalability

On Wed, Aug 26, 2015 at 03:53:26PM -0700, Hideaki Kimura wrote:
> Sure, let me elaborate.
> 
> Executive summary:
>  Yes, enabling a process-wide timer on such a large machine is not wise, but
> sometimes users/applications cannot avoid it.
> 
> 
> The issue was observed actually not in a database itself but in a common
> library it links to; gperftools.
> 
> The database itself is optimized for many cores/sockets, so of course it avoids
> setting a process-wide timer or doing other unscalable things. It just links to
> libprofiler for an optional feature that profiles performance bottlenecks only
> when the user turns it on. We of course avoid turning the feature on except
> while we debug/tune the database.
> 
> However, libprofiler sets the timer even when the client program doesn't
> invoke any of its functions: libprofiler does it when the shared library is
> loaded. We asked the developer of libprofiler to change the behavior,
> but it seems there is a reason to keep it:
>   https://code.google.com/p/gperftools/issues/detail?id=133
> 
> Based on this, I think there are two reasons why we should ameliorate this
> issue in the kernel layer.
> 
> 
> 1. In the particular case, it's hard to prevent or even detect the issue in
> user space.
> 
> We (a team of low-level database and kernel experts) in fact spent a huge
> amount of time just figuring out what the bottleneck was, because
> nothing measurable happens in user space. I pulled out countless hairs.
> 
> Also, the user has to de-link the library from the application to prevent
> the itimer installation. Imagine a case where the software is proprietary.
> It won't fly.
> 
> 
> 2. This is just one example. There could be many other such
> binaries/libraries that do similar things somewhere in a complex software
> stack.
> 
> Today we haven't heard of many such cases, but people will start hitting it
> once 100s~1,000s of cores become common.
> 
> 
> After applying this patchset, we have observed that the performance hit
> almost completely went away, at least for 240 cores. So, it's quite
> beneficial in the real world.

I can easily imagine that a lot of code incidentally uses posix cpu timers
when it's not strictly required. But it doesn't look right to fix the kernel
for that, for this simple reason: posix cpu timers, even after your fix,
should still introduce noticeable overhead. All threads of a process with a
timer enqueued account their elapsed cputime in a shared atomic variable. Add
to that the overhead of enqueuing the timer and firing it. There are a bunch
of scalability issues there.
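
For reference, here is a minimal sketch of the kind of process-wide
profiling timer being discussed. This is not gperftools code. It only
assumes that such a timer is an ITIMER_PROF itimer, which is the usual way
to get SIGPROF-based sampling. Python is used for brevity; the function
names and the 10 ms interval are made up for illustration. Unix only.

```python
import signal
import time

ticks = 0  # number of SIGPROF deliveries seen so far


def on_sigprof(signum, frame):
    """Count each expiry of the profiling timer."""
    global ticks
    ticks += 1


def install_process_wide_timer(interval_s=0.01):
    """Arm ITIMER_PROF (hypothetical helper; the interval is an assumption).

    The timer counts CPU time consumed by the whole process, so once it
    is armed the kernel has to account cputime across all of its threads.
    """
    # Install the handler first: the default action of SIGPROF is to
    # terminate the process.
    signal.signal(signal.SIGPROF, on_sigprof)
    signal.setitimer(signal.ITIMER_PROF, interval_s, interval_s)


def remove_process_wide_timer():
    """Disarm the timer; a zero value disables it."""
    signal.setitimer(signal.ITIMER_PROF, 0.0)


def burn_cpu(seconds):
    """Spin until the process has consumed `seconds` more CPU time."""
    end = time.process_time() + seconds
    spins = 0
    while time.process_time() < end:
        spins += 1
    return spins


install_process_wide_timer()
burn_cpu(0.2)
remove_process_wide_timer()
print("SIGPROF deliveries:", ticks)
```

The timer only advances while the process consumes CPU time, and it is
armed for the process as a whole rather than for one thread. A library
that arms it at load time therefore imposes the accounting on every thread
of the application, whether or not profiling is ever used.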

> 
> -- 
> Hideaki Kimura