Message-ID: <xm2661qxq4dy.fsf@sword-of-the-dawn.mtv.corp.google.com>
Date: Mon, 09 Dec 2013 10:13:29 -0800
From: bsegall@...gle.com
To: Chris Redpath <chris.redpath@....com>
Cc: pjt@...gle.com, mingo@...hat.com, peterz@...radead.org,
alex.shi@...aro.org, morten.rasmussen@....com,
dietmar.eggemann@....com, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/2] sched: update runqueue clock before migrations away

Chris Redpath <chris.redpath@....com> writes:
> If we migrate a sleeping task away from a CPU which has the
> tick stopped, then both the clock_task and decay_counter will
> be out of date for that CPU and we will not decay load correctly
> regardless of how often we update the blocked load.
>
> This is only an issue for tasks which are not on a runqueue
> (because otherwise that CPU would be awake) and simultaneously
> the CPU the task previously ran on has had the tick stopped.
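
For anyone following along, here is a rough userspace sketch (explicitly
not kernel code; the kernel uses a fixed-point lookup table rather than
the crude remainder step below) of the geometric decay that per-entity
load tracking applies, using the y^32 == 1/2 constant from fair.c. If
the old CPU's decay counter is stale because its tick was stopped,
"periods" comes out too small and the blocked load stays artificially
high:

#include <stdio.h>
#include <stdint.h>

#define LOAD_AVG_PERIOD 32	/* periods after which load halves */

static uint64_t decay_load(uint64_t load, uint64_t periods)
{
	/* One halving per full LOAD_AVG_PERIOD ... */
	load >>= periods / LOAD_AVG_PERIOD;
	periods %= LOAD_AVG_PERIOD;
	/* ... then a crude linear step toward the next halving
	 * (the kernel interpolates with a precomputed table). */
	return load - (load * periods) / (2 * LOAD_AVG_PERIOD);
}

int main(void)
{
	/* A task contributing 1024 units, asleep for 64 decay periods,
	 * vs. a stale counter that only observed 2 of them. */
	printf("fresh clock: %llu\n",
	       (unsigned long long)decay_load(1024, 64));
	printf("stale clock: %llu\n",
	       (unsigned long long)decay_load(1024, 2));
	return 0;
}
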
>
> Signed-off-by: Chris Redpath <chris.redpath@....com>

This looks like it is basically correct, but it seems unfortunate to
take any rq lock for these ttwus. I don't know enough about the nohz
machinery to know if that's at all avoidable.
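
For context, the path in question is roughly the following (a simplified
sketch of the call chain as I read it, details elided):

/*
 * try_to_wake_up(p)
 *   cpu = select_task_rq(p, ...);
 *   set_task_cpu(p, cpu)
 *     p->sched_class->migrate_task_rq(p, cpu)
 *       migrate_task_rq_fair(p, cpu)
 *         raw_spin_lock_irqsave(&old_rq->lock, ...)  <- new with this patch
 */

so every affected wakeup now contends on the old CPU's rq->lock.
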
> ---
> kernel/sched/fair.c | 30 ++++++++++++++++++++++++++++++
> 1 file changed, 30 insertions(+)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index b7e5945..0af1dc2 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -4324,6 +4324,7 @@ unlock:
> return new_cpu;
> }
>
> +static int nohz_test_cpu(int cpu);
> /*
> * Called immediately before a task is migrated to a new cpu; task_cpu(p) and
> * cfs_rq_of(p) references at time of call are still valid and identify the
> @@ -4343,6 +4344,25 @@ migrate_task_rq_fair(struct task_struct *p, int next_cpu)
> * be negative here since on-rq tasks have decay-count == 0.
> */
> if (se->avg.decay_count) {
> + /*
> + * If we migrate a sleeping task away from a CPU
> + * which has the tick stopped, then both the clock_task
> + * and decay_counter will be out of date for that CPU
> + * and we will not decay load correctly.
> + */
> + if (!se->on_rq && nohz_test_cpu(task_cpu(p))) {
Regarding p->on_rq: se->on_rq must already be false for set_task_cpu()
to be called at all, so the !se->on_rq check here is redundant. That
said, barring bugs like the one you fixed in patch 1, I think
decay_count != 0 should also imply !p->on_rq.
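
i.e., assuming that implication holds, the inner check could presumably
just be (untested):

	if (nohz_test_cpu(task_cpu(p))) {
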
> + struct rq *rq = cpu_rq(task_cpu(p));
> + unsigned long flags;
> + /*
> + * Current CPU cannot be holding rq->lock in this
> + * circumstance, but another might be. We must hold
> + * rq->lock before we go poking around in its clocks
> + */
> + raw_spin_lock_irqsave(&rq->lock, flags);
> + update_rq_clock(rq);
> + update_cfs_rq_blocked_load(cfs_rq, 0);
> + raw_spin_unlock_irqrestore(&rq->lock, flags);
> + }
> se->avg.decay_count = -__synchronize_entity_decay(se);
> atomic_long_add(se->avg.load_avg_contrib,
> &cfs_rq->removed_load);
> @@ -6507,6 +6527,11 @@ static struct {
> unsigned long next_balance; /* in jiffy units */
> } nohz ____cacheline_aligned;
>
> +static int nohz_test_cpu(int cpu)
> +{
> + return cpumask_test_cpu(cpu, nohz.idle_cpus_mask);
> +}
> +
> static inline int find_new_ilb(int call_cpu)
> {
> int ilb = cpumask_first(nohz.idle_cpus_mask);
> @@ -6619,6 +6644,11 @@ static int sched_ilb_notifier(struct notifier_block *nfb,
> return NOTIFY_DONE;
> }
> }
> +#else
> +static int nohz_test_cpu(int cpu)
> +{
> + return 0;
> +}
> #endif
>
> static DEFINE_SPINLOCK(balancing);