lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1292918694.5021.194.camel@laptop>
Date:	Tue, 21 Dec 2010 09:04:54 +0100
From:	Peter Zijlstra <a.p.zijlstra@...llo.nl>
To:	Frederic Weisbecker <fweisbec@...il.com>
Cc:	LKML <linux-kernel@...r.kernel.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	Ingo Molnar <mingo@...e.hu>,
	Steven Rostedt <rostedt@...dmis.org>,
	Lai Jiangshan <laijs@...fujitsu.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Anton Blanchard <anton@....ibm.com>,
	Tim Pepper <lnxninja@...ux.vnet.ibm.com>
Subject: Re: [RFC PATCH 10/15] nohz_task: Enter in extended quiescent state
 when in userspace

On Tue, 2010-12-21 at 02:27 +0100, Frederic Weisbecker wrote:
> On Mon, Dec 20, 2010 at 05:18:40PM +0100, Peter Zijlstra wrote:
> > On Mon, 2010-12-20 at 16:24 +0100, Frederic Weisbecker wrote:
> > > (we check if the local cpu is in nohz mode, which means
> > > no other task compete on that CPU)
> > 
> > You keep repeating that definition, but its not true.. It means there is
> > not work for the tick to do, the tick does _tons_ more besides
> > preemption, so nr_running==1 is necessary but not sufficient. 
> 
> Sure but the point is that if the tick is not running, it means
> that the nohz task is the only task running on that CPU.

No, that too isn't true, the cpu could be idle.

> Now indeed there are other reasons for the tick to restart like RCU
> or the effective nohz mode to physically happen or to be delayed,
> which is decided by tick_nohz_stop_sched_tick().

You really really really badly need to read through the whole tick path
and look at all the things it does, put them in a list, then look at the
current nohz code, mark those it deals with, then go through the nohz
code again and find the nr_running==0 assumptions, then make sure you've
covered everything.

I'm very confident you'll find a number of things to fix. At the very
least your current patch set totally forgets about task runtime
accounting (account_process_tick()), the existing NO_HZ doesn't need to
worry about that because the system is idle so all it needs to do is add
idle ticks when it wakes up (and possibly steal time for the virt muck).

You also miss the profile_tick(), and you need to go through the load
accounting muck (both of them) to see if there are any nr_running==0
assumptions there.

You also need to fix the perf counter list rotation stuff, and again,
check if no_hz load-balancing can deal with the nr_running==1 situation.

Now, there might be more, this is just a quick one-minute scan through
the tick code, but the fact that nowhere in this whole patch-set is even
a mention of these things makes me worry about your whole approach.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ