linux-kernel - Re: [PATCH 0/6] x86/cpu hotplug: Wake up offline CPU via mwait or nmi

Message-ID: <20120605221240.GW2388@linux.vnet.ibm.com>
Date:	Tue, 5 Jun 2012 15:12:40 -0700
From:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	Thomas Gleixner <tglx@...utronix.de>,
	"Luck, Tony" <tony.luck@...el.com>,
	"Yu, Fenghua" <fenghua.yu@...el.com>,
	Rusty Russell <rusty@...tcorp.com.au>,
	Ingo Molnar <mingo@...e.hu>, H Peter Anvin <hpa@...or.com>,
	"Siddha, Suresh B" <suresh.b.siddha@...el.com>,
	"Mallick, Asit K" <asit.k.mallick@...el.com>,
	Arjan Dan De Ven <arjan@...ux.intel.com>,
	linux-kernel <linux-kernel@...r.kernel.org>,
	x86 <x86@...nel.org>, linux-pm <linux-pm@...r.kernel.org>,
	"Srivatsa S. Bhat" <srivatsa.bhat@...ux.vnet.ibm.com>
Subject: Re: [PATCH 0/6] x86/cpu hotplug: Wake up offline CPU via mwait or nmi

On Tue, Jun 05, 2012 at 11:30:56PM +0200, Peter Zijlstra wrote:
> On Tue, 2012-06-05 at 22:47 +0200, Thomas Gleixner wrote:
> > On Tue, 5 Jun 2012, Peter Zijlstra wrote:
> > > On Tue, 2012-06-05 at 21:43 +0200, Thomas Gleixner wrote:
> > > > Vs. the interrupt/timer/other crap madness:
> > > > 
> > > >  - We really don't want to have an interrupt balancer in the kernel
> > > >    again, but we need a mechanism to prevent the user space balancer
> > > >    trainwreck from ruining the power saving party.
> > > 
> > > What's wrong with having an interrupt balancer tied to the scheduler
> > > which optimistically tries to avoid interrupting nohz/isolated/idle
> > > cpus?
> > 
> > You want to run through a boatload of interrupts and change their
> > affinity from the load balancer or something related? Not really.
> 
> Well, no not like that, but I think we could do with some coupling
> there. Like steer active interrupts away when they keep hitting idle
> state.

But the guys who are more fanatic about performance than about energy
efficiency would -want- the interrupts to hit the idle CPUs, right?

> > > >  - The other details (silly IPIs and cross CPU timer arming) are way
> > > >    easier to solve by a proper prohibitive state than by chasing that
> > > >    nonsense all over the tree forever. 
> > > 
> > > But we need to solve all that without a prohibitive state anyway for the
> > > isolation stuff to be useful.
> > 
> > And what is preventing us from using a prohibitive state for that purpose?
> > The isolation stuff Frederic is working on is nothing else than
> > dynamically switching in and out of a prohibitive state.
> 
> I don't think so. It's perfectly fine to get TLB invalidate IPIs or
> resched-IPIs or any other kind of kernel work that needs doing. It's even
> fine for timers to happen. What's not fine is getting spurious IPIs when
> there's no work to do, or getting timers from another workload.

One desirable property of CPU hotplug is that it puts the CPU in a state
where it no longer needs to receive TLB invalidations, resched IPIs, etc.
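
To make that property concrete, here is a minimal user-space sketch (not
kernel code) of an IPI sender that skips CPUs in such a state.  The names
cpu_quiesced_mask and ipi_targets() are hypothetical, and a plain 64-bit
word stands in for the kernel's struct cpumask.

```c
#include <stdint.h>

/*
 * Hypothetical: one bit per CPU, set when that CPU is offline/quiesced
 * and therefore needs no TLB-invalidate or resched IPIs.
 */
static uint64_t cpu_quiesced_mask;

/* Mark a CPU as quiesced (cpu must be in 0..63 in this sketch). */
static void cpu_set_quiesced(unsigned int cpu)
{
	cpu_quiesced_mask |= (uint64_t)1 << cpu;
}

/* Return the CPU to normal service. */
static void cpu_clear_quiesced(unsigned int cpu)
{
	cpu_quiesced_mask &= ~((uint64_t)1 << cpu);
}

/* Filter a requested IPI target set: quiesced CPUs are dropped. */
static uint64_t ipi_targets(uint64_t requested)
{
	return requested & ~cpu_quiesced_mask;
}
```

With CPU 2 quiesced, a request to interrupt CPUs 0-3 would reach only
CPUs 0, 1 and 3.  The point is that the filtering lives in one place
rather than in every caller.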

> > I completely understand your reasoning, but I seriously doubt that we
> > can educate the whole crowd to understand the problems at hand. My
> > experience in the last 10+ years tells me that if you do not restrict
> > stuff you enter a never ending "chase the human stupidity^Wcreativity"
> > game. Even if you restrict it massively you end up observing a patch
> > which does:
> > 
> > +       d->core_internal_state__do_not_mess_with_it |= SOME_CONSTANT;
> > 
> > So do you really want to promote a solution which requires brain
> > sanity of all involved parties?
> 
> I just don't see a way to hard-wall interrupt sources, esp. when they
> might be perfectly fine or even required for the correct operation of
> the machine and desired workload.
> 
> kstopmachine -- however much we all love that thing -- will need to stop
> all cpus and violate isolation barriers.
> 
> RCU has similar nasties.

I am working to rid RCU of this sort of thing.  I have changed rcu_barrier() so
that it avoids messing with CPUs that don't have callbacks, which will
be almost all of the idle CPUs, especially for CONFIG_RCU_FAST_NO_HZ=y.
I believe that I have also removed all of RCU's dependencies on CPU
hotplug's using kstopmachine, though Murphy would say otherwise.
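
The idea of leaving callback-free CPUs alone can be illustrated with a
small user-space sketch.  This is not the actual rcu_barrier() code; the
array nr_callbacks[], the constant NR_CPUS_SKETCH and the function
barrier_cpus_to_disturb() are all made up for illustration.

```c
#include <stddef.h>

#define NR_CPUS_SKETCH 4	/* hypothetical, kept small for illustration */

/* Hypothetical per-CPU count of queued callbacks. */
static unsigned long nr_callbacks[NR_CPUS_SKETCH];

/*
 * Return how many CPUs a barrier would have to disturb: only those with
 * at least one callback queued.  CPUs with empty queues are left alone.
 */
static size_t barrier_cpus_to_disturb(void)
{
	size_t cpu, n = 0;

	for (cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
		if (nr_callbacks[cpu] != 0)
			n++;	/* real code would post a barrier callback here */
	return n;
}
```

If only one CPU has callbacks queued, only that CPU is touched, and an
idle CPU with an empty queue is never woken on the barrier's behalf.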

I still need to fix up synchronize_sched_expedited(), but that is on
the list.  I considered getting rid of this one, but I am probably going
to have to make synchronize_sched() map to it during boot time to keep
the boot-speed demons satisfied.

> > What's wrong with making a 'hotplug' model which provides the
> > following states:
> 
> For one calling it hotplug ;-)

OK, what would you want to call it?  CPU quiesce with different levels
of quiescence?  CPU cripple?  CPU curfew?  Something else?

> >   Fully functional
> > 
> >   Isolated functional
> > 
> >   Isolated idle
> 
> I can see the isolated idle, but we can implement that as an idle state
> and have smp_send_reschedule() do the magic wakeup. This should even
> work for crippled hardware.
> 
> What I can't see is the isolated functional, aside from the above
> mentioned things, that's not strictly a per-cpu property, we can have a
> group that's isolated from the rest but not from each other.

I suspect that Thomas is thinking that the CPU is so idle that it no
longer has to participate in TLB invalidation or RCU.  (Thomas will
correct me if I am confused.)  But Peter, is that the level of idle
you are thinking of?
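
For the sake of discussion, the three upper states quoted above could be
written down roughly as follows.  This is only a sketch: the enum and
function names are hypothetical, and the rule that only the deepest level
excuses a CPU from TLB invalidation and RCU is my assumption about what is
being proposed, not something anyone has posted as a patch.

```c
#include <stdbool.h>

/* Hypothetical names for the three upper states quoted above. */
enum cpu_quiesce_level {
	CPU_FULLY_FUNCTIONAL,
	CPU_ISOLATED_FUNCTIONAL,
	CPU_ISOLATED_IDLE,
};

/*
 * Assumption for this sketch: only a CPU in the isolated-idle state is
 * excused from TLB invalidation and RCU.  The other two levels still
 * take part in both.
 */
static bool cpu_needs_tlb_and_rcu(enum cpu_quiesce_level level)
{
	return level != CPU_ISOLATED_IDLE;
}
```

Whether isolated-functional should also be excused from some of this work
is exactly the open question.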

							Thanx, Paul

> > Note, that these upper states are not 'hotplug' by definition, but
> > they have to be traversed by hot(un)plug as well. So why not make
> > them explicit states which we can exploit for the other problems we
> > want to solve?
> 
> I think I can agree with what you call isolated-idle, as long as we
> expose that as a generic idle state and put some magic in
> smp_send_reschedule(). But ideally we'd conceive a better name than
> hotplug for all this and only call the transition down to 'physical
> hotplug mess' hotplug.
> 
> > That puts the burden on the core facility design, but it removes the
> > maintenance burden to chase a gazillion of instances doing IPIs,
> > cross cpu function calls, add_timer_on, add_work_on and whatever
> > nonsense.
> 
> I'd love for something like that to exist and work, I'm just not seeing
> how it could.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@...r.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 
