linux-kernel - Re: [RFC PATCH 00/32] Nohz cpusets (was: Nohz Tasks)


Message-ID: <20110830140648.GL9748@somewhere.redhat.com>
Date:	Tue, 30 Aug 2011 16:06:53 +0200
From:	Frederic Weisbecker <fweisbec@...il.com>
To:	Gilad Ben-Yossef <gilad@...yossef.com>
Cc:	LKML <linux-kernel@...r.kernel.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Anton Blanchard <anton@....ibm.com>,
	Avi Kivity <avi@...hat.com>, Ingo Molnar <mingo@...e.hu>,
	Lai Jiangshan <laijs@...fujitsu.com>,
	"Paul E . McKenney" <paulmck@...ux.vnet.ibm.com>,
	Paul Menage <menage@...gle.com>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Stephen Hemminger <shemminger@...tta.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	Tim Pepper <lnxninja@...ux.vnet.ibm.com>
Subject: Re: [RFC PATCH 00/32] Nohz cpusets (was: Nohz Tasks)

On Wed, Aug 24, 2011 at 05:41:05PM +0300, Gilad Ben-Yossef wrote:
> Hi,
> 
> On Mon, Aug 15, 2011 at 6:51 PM, Frederic Weisbecker <fweisbec@...il.com> wrote:
> >
> > For those who want to play:
> >
> > git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing.git
> >        nohz/cpuset-v1
> 
> 
> You caught me in playful mood, so I took it for a spin... :-)
> 
> I know this is far from being production ready, but I hope you'll find
> the feedback useful.
> 
> First a short description of my testing setup is in order, I believe:
> 
> I've set up a small x86 VM with 4 CPUs running your git tree and a
> minimal buildroot system. I've created 2 cpusets: sys and nohz, and
> then assigned every task I could to the sys cpuset and set
> adaptive_nohz on the nohz set.
> 
> To make double sure I have no task on my nohz cpuset CPU, I've booted
> the system with the isolcpus command-line option, isolating the same CPU I've
> assigned to the nohz set. This shouldn't be needed of course, but just
> in case.

Ah, I haven't tested with isolcpus, especially since it's headed toward
removal.

> 
> I then ran a silly program I've written that basically eats CPU cycles
> (https://github.com/gby/cpueat) and assigned it to the nohz set and
> monitored the number of interrupts using /proc/interrupts.
> 
> Now, for the things I've noticed -
> 
> 1. Before I turn adaptive_nohz to 1, when no task is running on the
> nohz cpuset cpu, the tick is indeed idle (regular nohz case) and very
> few function call IPIs are seen. However, when I turn adaptive_nohz to
> 1 (but still with no task running on the CPU), the tick remains idle,
> but I get an IPI function call interrupt at almost the rate the tick
> would have fired.

Yeah, I believe this is due to RCU trying to wake up our nohz CPU.
I need to have a deeper look there.
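
The comparison described in point 1 (function call IPIs versus the local timer tick on one CPU) can be approximated by diffing two snapshots of /proc/interrupts. The following is only a minimal sketch, not anything from the patch set: the helper names are hypothetical, and it assumes the x86 row labels `LOC` (local timer interrupts) and `CAL` (function call interrupts).

```python
def parse_interrupts(text):
    """Parse /proc/interrupts text into {row label: [count per CPU]}."""
    lines = text.splitlines()
    if not lines:
        return {}
    ncpu = len(lines[0].split())  # header row: CPU0 CPU1 ...
    table = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue
        counts = []
        for token in rest.split():
            # Stop at the first non-numeric token (the description column)
            # or once every CPU column has been consumed.
            if len(counts) == ncpu or not token.isdigit():
                break
            counts.append(int(token))
        table[label.strip()] = counts
    return table


def per_cpu_delta(before, after, label, cpu):
    """Interrupts of kind `label` taken by `cpu` between two snapshots."""
    return after[label][cpu] - before[label][cpu]


def read_interrupts(path="/proc/interrupts"):
    """Take one snapshot of the live interrupt counters."""
    with open(path) as f:
        return parse_interrupts(f.read())
```

Taking `read_interrupts()` twice, a known interval apart, and calling `per_cpu_delta()` for `"CAL"` and `"LOC"` on the nohz CPU gives the two rates to compare. Rows such as `ERR` carry fewer columns than there are CPUs, so the sketch should only be indexed per CPU for per-CPU rows.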

> 2. When I run my little cpueat program on the nohz CPU, the tick does
> not actually go off. Instead it ticks away as usual. I know it is
> the only eligible task to run, since as soon as I kill it the tick
> turns off (regular nohz mode again). I've tinkered around and found
> out that what stops the tick going away is the check for rcu_pending()
> in cpuset_nohz_can_stop_tick(). It seems to always be true. When I
> removed that check experimentally and repeated the test, the tick indeed
> stopped with my cpueat task running. Of course, I don't suggest this is
> the sane thing to do - I just wondered if that is what stopped the tick
> going away, and it seems that it is.

Are you sure the tick never goes off?
But yeah, maybe there is something that constantly requires RCU grace
periods to complete in your system. I should drop the rcu_pending()
check as long as we want to stop the tick from userspace because
there we are off the RCU state machine.


> 3. My little cpueat program tries to fork a child process after 100k
> iterations of a CPU-bound loop. It usually takes a few seconds to
> happen. The idea is to make sure that the tick resumes when
> nr_running > 1. In my case, I got a kernel panic. Since it happened
> with some debug code I added and with the aforementioned experimental
> removal of the rcu_pending() check, I'm assuming for now it's all my
> fault, but I will look into verifying it further and will send panic
> logs if that proves useful.

I got some panics too but haven't seen any for some time. I've made a
lot of changes since then, though, so I assumed the condition that
triggered it just went away.

IIRC, it was a locking inversion against the rq lock and some other lock.
Very nice condition for a cool lockup ;)
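
For readers who want to reproduce the test pattern from point 3 (spin alone on the CPU, then fork so that more than one task is runnable), here is a rough sketch. It is not the actual cpueat source; the function names and the loop body are hypothetical, and an interpreted loop of 100k iterations finishes far faster than the few seconds mentioned above, so the iteration count would need to be raised for a real run.

```python
import os


def burn(iterations):
    """CPU-bound busy loop; returns a value so the work has a visible result."""
    acc = 0
    for i in range(iterations):
        acc = (acc * 31 + i) % 1000003
    return acc


def burn_then_fork(iterations=100_000):
    """Spin as a single task, then fork so two runnable tasks compete.

    Returns the child's exit status as seen by the parent (-1 if the
    child did not exit normally).
    """
    burn(iterations)          # phase 1: only one runnable task
    pid = os.fork()           # phase 2: a second runnable task appears
    if pid == 0:
        status = 1
        try:
            burn(iterations)
            status = 0
        finally:
            os._exit(status)  # child must never return into the caller
    burn(iterations)          # parent keeps running alongside the child
    _, wait_status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(wait_status) if os.WIFEXITED(wait_status) else -1
```

The child inherits the parent's CPU affinity and cpuset, so if the parent was placed on the nohz CPU beforehand (by moving it into the cpuset, or with `os.sched_setaffinity`), both tasks end up competing for that CPU after the fork.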