linux-kernel - Re: [PATCH tip/sched/core v2] sched/rt: Simplify the IPI rt balancing logic

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20170504150355.0524c8e8@gandalf.local.home>
Date:   Thu, 4 May 2017 15:03:55 -0400
From:   Steven Rostedt <rostedt@...dmis.org>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     LKML <linux-kernel@...r.kernel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...nel.org>,
        Clark Williams <williams@...hat.com>,
        Daniel Bristot de Oliveira <bristot@...hat.com>,
        John Kacur <jkacur@...hat.com>, Scott Wood <swood@...hat.com>
Subject: Re: [PATCH tip/sched/core v2] sched/rt: Simplify the IPI rt
 balancing logic

On Thu, 4 May 2017 20:42:00 +0200
Peter Zijlstra <peterz@...radead.org> wrote:

> On Thu, May 04, 2017 at 01:25:38PM -0400, Steven Rostedt wrote:
> > > I think you want to write that as:
> > > 
> > > 	struct root_domain *rd = rq->rd;
> > > 	int cpu, next;
> > > 
> > > 	/* comment */
> > > 	for (;;) { 
> > > 		if (rd->rto_cpu >= nr_cpu_ids) {  
> > 
> > If we go with your change, then this needs to be:
> > 
> > 		if (rd->rto_cpu < 0) {
> >   
> > > 			cpu = cpumask_first(rd->rto_mask);
> > > 			rd->rto_cpu = cpu;
> > > 			return cpu;
> > > 		}  
> 
> No you can leave it out entirely.
> 
> > > 
> > > 		cpu = cpumask_next(rd->rto_mask);  
> > 
> > cpumask_next() requires two parameters.  
> 
> Indeed it does:
> 
> 		cpu = cpumask_next(rd->rto_cpu, rd->rto_mask);
> 
> will be cpumask_first() when rto_cpu == -1, see for example
> for_each_cpu().

OK, I'll have to take a look.

> 
> > > > +static inline bool rto_start_trylock(atomic_t *v)
> > > > +{
> > > > +	return !atomic_cmpxchg(v, 0, 1);    
> > > 
> > > Arguably this could be: !atomic_cmpxchg_acquire(v, 0, 1);  
> > 
> > Yes agreed. But if you remember, at the time I was basing this off of
> > tip/sched/core, which didn't have atomic_cmpxchg_acquire() available.  
> 
> No that's the try_cmpxchg stuff, the _acquire stuff is long in.

OK, I was confused with the try stuff, as I remember that was what you
suggested before.

> 
> > Thanks for the review. I'll spin up a new patch. Unfortunately, I no
> > longer have access to the behemoth machine. I'll only be testing this
> > on 4 cores now, or 8 with HT.  
> 
> I have something with 144 CPUs in or thereabout, if you have the
> testcase handy I can give it a spin.

My test case is two fold. It basically just involves running rteval.

One is to run it on latest mainline to make sure it doesn't crash. The
other is to backport it to the latest -rt patch, and see how well it
helps with latency.

To get rteval:

 $ git clone git://git.kernel.org/pub/scm/linux/kernel/git/clrkwllms/rteval.git
 $ cd rteval
 $ git checkout origin/v2/master

-- Steve