linux-kernel - Re: [patch 1/2] nohz: only wakeup a single target cpu when kicking a task

Message-ID: <20201013171328.GA19284@fuller.cnet>
Date:   Tue, 13 Oct 2020 14:13:28 -0300
From:   Marcelo Tosatti <mtosatti@...hat.com>
To:     Frederic Weisbecker <frederic@...nel.org>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        linux-kernel@...r.kernel.org,
        Nitesh Narayan Lal <nitesh@...hat.com>,
        Peter Xu <peterx@...hat.com>
Subject: Re: [patch 1/2] nohz: only wakeup a single target cpu when kicking a task

On Thu, Oct 08, 2020 at 09:54:44PM +0200, Frederic Weisbecker wrote:
> On Thu, Oct 08, 2020 at 02:54:09PM -0300, Marcelo Tosatti wrote:
> > On Thu, Oct 08, 2020 at 02:22:56PM +0200, Peter Zijlstra wrote:
> > > On Wed, Oct 07, 2020 at 03:01:52PM -0300, Marcelo Tosatti wrote:
> > > > When adding a tick dependency to a task, it's necessary to
> > > > wake up the CPU where the task resides to reevaluate tick
> > > > dependencies on that CPU.
> > > > 
> > > > However, the current code wakes up all nohz_full CPUs, which
> > > > is unnecessary.
> > > > 
> > > > Switch to waking up a single CPU, by using ordering of writes
> > > > to task->cpu and task->tick_dep_mask.
> > > > 
> > > > From: Frederic Weisbecker <frederic@...nel.org>
> > > > Suggested-by: Peter Zijlstra <peterz@...radead.org>
> > > > Signed-off-by: Frederic Weisbecker <frederic@...nel.org>
> > > > Signed-off-by: Marcelo Tosatti <mtosatti@...hat.com>
> > > > 
> > > > Index: linux-2.6/kernel/time/tick-sched.c
> > > > ===================================================================
> > > > --- linux-2.6.orig/kernel/time/tick-sched.c
> > > > +++ linux-2.6/kernel/time/tick-sched.c
> > > > @@ -274,6 +274,31 @@ void tick_nohz_full_kick_cpu(int cpu)
> > > >  	irq_work_queue_on(&per_cpu(nohz_full_kick_work, cpu), cpu);
> > > >  }
> > > >  
> > > > +static void tick_nohz_kick_task(struct task_struct *tsk)
> > > > +{
> > > > +	int cpu = task_cpu(tsk);
> > > > +
> > > > +	/*
> > > > +	 * If the task concurrently migrates to another cpu,
> > > > +	 * we guarantee it sees the new tick dependency upon
> > > > +	 * schedule.
> > > > +	 *
> > > > +	 *
> > > > +	 * set_task_cpu(p, cpu);
> > > > +	 *   STORE p->cpu = @cpu
> > > > +	 * __schedule() (switch to task 'p')
> > > > +	 *   LOCK rq->lock
> > > > +	 *   smp_mb__after_spin_lock()          STORE p->tick_dep_mask
> > > > +	 *   tick_nohz_task_switch()            smp_mb() (atomic_fetch_or())
> > > > +	 *      LOAD p->tick_dep_mask           LOAD p->cpu
> > > > +	 */
> > > > +
> > > > +	preempt_disable();
> > > > +	if (cpu_online(cpu))
> > > > +		tick_nohz_full_kick_cpu(cpu);
> > > > +	preempt_enable();
> > > > +}
> > > 
> > > So we need to kick the CPU unconditionally, or only when the task is
> > > actually running? AFAICT we only care about current->tick_dep_mask.
> > 
> > The tick is necessary to execute run_posix_cpu_timers() from the
> > tick interrupt, even if the task is not running.
> 
> Yes but if the task isn't running, run_posix_cpu_timers() doesn't have
> anything to elapse. So indeed we can spare the IPI if the task is not
> running. Provided ordering makes sure that the task sees the new dependency
> when it schedules in of course.

True.

 * p->on_cpu <- { 0, 1 }:
 *
 *   is set by prepare_task() and cleared by finish_task() such that it will be
 *   set before p is scheduled-in and cleared after p is scheduled-out, both
 *   under rq->lock. Non-zero indicates the task is running on its CPU.


CPU-0 (tick_set_dep)            CPU-1 (task switch)

STORE p->tick_dep_mask
smp_mb() (atomic_fetch_or())
LOAD p->on_cpu


                                context_switch(prev, next)
                                STORE next->on_cpu = 1
                                ...                             [*]

                                LOAD current->tick_dep_mask


I don't see any explicit memory barrier in the [*] section?
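
For reference, the diagram above has the shape of the classic
store-buffering pattern: each side stores one variable and then loads
the other. Below is a minimal userspace sketch of that pattern in C11.
It is not kernel code. The variable and function names are hypothetical
stand-ins, smp_mb() is modelled with a seq_cst fence, and the fence on
the CPU-1 side is an assumption about what [*] would have to provide,
not a claim about what context_switch() actually does.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for p->tick_dep_mask and p->on_cpu. */
static atomic_int tick_dep_mask;
static atomic_int on_cpu;
static atomic_int go;		/* start flag so both sides race */
static int saw_on_cpu;		/* result of CPU-0's LOAD p->on_cpu */
static int saw_dep;		/* result of CPU-1's LOAD tick_dep_mask */

/* CPU-0 side: set the dependency, full barrier, then check on_cpu. */
static void *cpu0_tick_set_dep(void *unused)
{
	(void)unused;
	while (!atomic_load_explicit(&go, memory_order_acquire))
		;
	atomic_fetch_or_explicit(&tick_dep_mask, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);	/* models smp_mb() */
	saw_on_cpu = atomic_load_explicit(&on_cpu, memory_order_relaxed);
	return NULL;
}

/* CPU-1 side: mark the task running, barrier, then read the mask. */
static void *cpu1_task_switch(void *unused)
{
	(void)unused;
	while (!atomic_load_explicit(&go, memory_order_acquire))
		;
	atomic_store_explicit(&on_cpu, 1, memory_order_relaxed);
	/* ASSUMPTION: a full barrier at the [*] position. */
	atomic_thread_fence(memory_order_seq_cst);
	saw_dep = atomic_load_explicit(&tick_dep_mask, memory_order_relaxed);
	return NULL;
}

/*
 * Run the race 'trials' times and count the outcomes where CPU-0 saw
 * on_cpu == 0 (so it would skip the kick) and CPU-1 saw no dependency.
 */
int count_missed_kicks(int trials)
{
	int missed = 0;

	for (int i = 0; i < trials; i++) {
		pthread_t t0, t1;
		int rc;

		atomic_store(&tick_dep_mask, 0);
		atomic_store(&on_cpu, 0);
		atomic_store(&go, 0);

		rc = pthread_create(&t0, NULL, cpu0_tick_set_dep, NULL);
		assert(rc == 0);
		rc = pthread_create(&t1, NULL, cpu1_task_switch, NULL);
		assert(rc == 0);
		(void)rc;

		atomic_store_explicit(&go, 1, memory_order_release);
		pthread_join(t0, NULL);
		pthread_join(t1, NULL);

		if (saw_on_cpu == 0 && saw_dep == 0)
			missed++;
	}
	return missed;
}
```

Built with something like "cc -std=c11 -pthread", count_missed_kicks()
should return 0 as long as both fences are present, since with a full
fence on each side at least one side must observe the other's store.
Dropping the fence in cpu1_task_switch() permits the "both miss"
outcome on hardware that reorders a store with a later load (x86
included), although a short run is not guaranteed to hit it.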