linux-kernel - Re: [PATCH 4/6] sched/isolation: Residual 1Hz scheduler tick offload

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20180129172018.GN2249@hirez.programming.kicks-ass.net>
Date:   Mon, 29 Jan 2018 18:20:18 +0100
From:   Peter Zijlstra <peterz@...radead.org>
To:     Frederic Weisbecker <frederic@...nel.org>
Cc:     Ingo Molnar <mingo@...nel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Chris Metcalf <cmetcalf@...lanox.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Luiz Capitulino <lcapitulino@...hat.com>,
        Christoph Lameter <cl@...ux.com>,
        "Paul E . McKenney" <paulmck@...ux.vnet.ibm.com>,
        Wanpeng Li <kernellwp@...il.com>,
        Mike Galbraith <efault@....de>, Rik van Riel <riel@...hat.com>
Subject: Re: [PATCH 4/6] sched/isolation: Residual 1Hz scheduler tick offload

On Mon, Jan 29, 2018 at 05:48:33PM +0100, Frederic Weisbecker wrote:
> On Mon, Jan 29, 2018 at 04:38:39PM +0100, Peter Zijlstra wrote:
> > I would very much like a few words on why sched_class::task_tick() is
> > safe to call remote -- from a quick look I think it actually is, but it
> > would be good to have some words here.
> 
> Let's rather say I can't prove that it is safe, given the amount of code that
> is behind throughout the various flavour of scheduler features.
> 
> But as far as I checked several times, it seems that nothing is accessed locally
> on ::scheduler_tick(). Everything looks fetched from the runqueue struct target
> while it is locked.
> 
> If we ever find local references such as "current" or "__this_cpu_*" in the path,
> we'll have to fix them.

Sure, but at least state you audited the code for what issues. That
tells me you know wth you were doing and gives more trust than blindly
changing random code ;-)

> > > +static void sched_tick_start(int cpu)
> > > +{
> > > +	struct tick_work *twork;
> > > +
> > > +	if (housekeeping_cpu(cpu, HK_FLAG_TICK))
> > > +		return;
> > 
> > This all looks very static :-(, you can't reconfigure this nohz_full
> > crud after boot?
> 
> Unfortunately yes. In fact making the nohz interface dynamically available
> through cpuset is the next big step.

OK, fair enough.

> > > +	WARN_ON_ONCE(!tick_work_cpu);
> > > +
> > > +	twork = per_cpu_ptr(tick_work_cpu, cpu);
> > > +	twork->cpu = cpu;
> > > +	INIT_DELAYED_WORK(&twork->work, sched_tick_remote);
> > > +	queue_delayed_work(system_unbound_wq, &twork->work, HZ);
> > > +}
> > 
> > Similarly, I think we want a few words about how unbound workqueues are
> > expected to behave vs NUMA.
> > 
> > AFAICT unbound workqueues by default prefer to run on a cpu in the same
> > node, but if no cpu is available, it doesn't go looking for the nearest
> > node that does have a cpu, it just punts to whatever random cpu.
> 
> Yes, and in fact you just made me look into wq_select_unbound_cpu() and
> it looks worse than that. If the current CPU is not in the wq_unbound_cpumask,
> a random one is picked up from that global cpumask without trying a near
> one in the current node.
> 
> Looks like room for improvement on the workqueue side. I'll see what I can do.

Great, thanks!