linux-kernel - Re: [RFC][PATCH v2] sched/rt: Use IPI to trigger RT task push migration instead of pulling

Message-ID: <07af01d050d0$8ba39e80$a2eadb80$@alibaba-inc.com>
Date:	Wed, 25 Feb 2015 15:56:21 +0800
From:	"Hillf Danton" <hillf.zj@...baba-inc.com>
To:	"Steven Rostedt" <rostedt@...dmis.org>
Cc:	"Ingo Molnar" <mingo@...nel.org>,
	"Peter Zijlstra" <peterz@...radead.org>,
	"'Thomas Gleixner'" <tglx@...utronix.de>,
	"'Clark Williams'" <williams@...hat.com>,
	"'Mike Galbraith'" <umgwanakikbuti@...il.com>,
	"linux-kernel" <linux-kernel@...r.kernel.org>
Subject: Re: [RFC][PATCH v2] sched/rt: Use IPI to trigger RT task push migration instead of pulling

> +static void try_to_push_tasks(void *arg)
> +{
> +	struct rt_rq *rt_rq = arg;
> +	struct rq *rq, *next_rq;
> +	int next_cpu = -1;
> +	int next_prio = MAX_PRIO + 1;
> +	int this_prio;
> +	int src_prio;
> +	int prio;
> +	int this_cpu;
> +	int success;
> +	int cpu;
> +
> +	/* Make sure we can see csd_cpu */
> +	smp_rmb();
> +
> +	this_cpu = rt_rq->push_csd_cpu;
> +
> +	/* Paranoid check */
> +	BUG_ON(this_cpu != smp_processor_id());
> +
> +	rq = cpu_rq(this_cpu);
> +
> +	/*
> +	 * If there's nothing to push here, then see if another queue
> +	 * can push instead.
> +	 */
> +	if (!has_pushable_tasks(rq))
> +		goto pass_the_ipi;
> +
> +	raw_spin_lock(&rq->lock);
> +	success = push_rt_task(rq);
> +	raw_spin_unlock(&rq->lock);
> +
> +	if (success)
> +		goto done;

Does the latency, 150us over a 20 hour run, go up if we goto done directly?
Hillf
> +
> +	/* Nothing was pushed, try another queue */
> +pass_the_ipi:
> +
> +	/*
> +	 * We use the priority that determined to send to this CPU
> +	 * even if the priority for this CPU changed. This is used
> +	 * to determine what other CPUs to send to, to keep from
> +	 * doing a ping pong from each CPU.
> +	 */
> +	this_prio = rt_rq->push_csd_prio;
> +	src_prio = rt_rq->highest_prio.curr;
> +
> +	for_each_cpu(cpu, rq->rd->rto_mask) {
> +		if (this_cpu == cpu)
> +			continue;
> +
> +		/*
> +		 * This function was called because some rq lowered its
> +		 * priority. It then searched for the highest priority
> +		 * rq that had overloaded tasks and sent an smp function
> +		 * call to that cpu to call this function to push its
> +		 * tasks. But when it got here, the task was either
> +		 * already pushed, or due to affinity, could not move
> +		 * the overloaded task.
> +		 *
> +		 * Now we need to see if there's another overloaded rq that
> +		 * has an RT task that can migrate to that CPU.
> +		 *
> +		 * We need to be careful, we do not want to cause a ping
> +		 * pong between this CPU and another CPU that has an RT task
> +		 * that can migrate, but not to the CPU that lowered its
> +		 * priority. Since the lowering priority CPU finds the highest
> +		 * priority rq to send to, we will ignore any rq that is of higher
> +		 * priority than this current one. That is, if a rq scheduled a
> +		 * task of higher priority, the schedule itself would do the
> +		 * push or pull then. We can safely ignore higher priority rqs.
> +		 * And if there's one that is the same priority, since the CPUS
> +		 * are searched in order we will ignore CPUS of the same priority
> +		 * unless the CPU number is greater than this CPU's number.
> +		 */
> +		next_rq = cpu_rq(cpu);
> +
> +		/* Use a single read for the next prio for decision making */
> +		prio = READ_ONCE(next_rq->rt.highest_prio.next);
> +
> +		/* Looking for highest priority */
> +		if (prio >= next_prio)
> +			continue;
> +
> +		/* Make sure that the rq can push to the source rq */
> +		if (prio >= src_prio)
> +			continue;
> +
> +		/* If the prio is higher than the current prio, ignore it */
> +		if (prio < this_prio)
> +			continue;
> +
> +		/*
> +		 * If the prio is equal to the current prio, only use it
> +		 * if the cpu number is greater than the current cpu.
> +		 * This prevents a ping pong effect.
> +		 */
> +		if (prio == this_prio && cpu < this_cpu)
> +			continue;
> +
> +		next_prio = prio;
> +		next_cpu = cpu;
> +	}
> +
> +	/* Nothing found, do nothing */
> +	if (next_cpu < 0)
> +		goto done;
> +
> +	/*
> +	 * Now we can not send another smp async function due to locking,
> +	 * use irq_work instead.
> +	 */
> +
> +	rt_rq->push_csd_cpu = next_cpu;
> +	rt_rq->push_csd_prio = next_prio;
> +
> +	/* Make sure the next cpu is seen on remote CPU */
> +	smp_mb();
> +
> +	irq_work_queue_on(&rt_rq->push_csd_work, next_cpu);
> +
> +	return;
> +
> +done:
> +	rt_rq->push_csd_pending = 0;
> +
> +	/* Now make sure the src CPU can see this update */
> +	smp_wmb();
> +}
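
For readers following the filter chain in the for_each_cpu() loop above, here is a rough user-space model of just the CPU selection step. It is a sketch under assumptions, not the patch's code: pick_next_cpu(), MODEL_MAX_PRIO, and the next_prio_of[] and overloaded[] arrays are hypothetical stand-ins for cpu_rq(cpu)->rt.highest_prio.next and rd->rto_mask, and locking, barriers and READ_ONCE() are left out. As in the kernel, a lower number means a higher priority.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's MAX_PRIO (lower value = higher priority). */
#define MODEL_MAX_PRIO 140

/*
 * User-space model of the selection loop in the quoted try_to_push_tasks().
 *
 * next_prio_of[cpu] models cpu_rq(cpu)->rt.highest_prio.next,
 * overloaded[cpu]   models membership of cpu in rq->rd->rto_mask,
 * this_prio         models rt_rq->push_csd_prio,
 * src_prio          models rt_rq->highest_prio.curr.
 *
 * Returns the CPU that would receive the next IPI, or -1 if none qualifies.
 */
static int pick_next_cpu(const int *next_prio_of, const bool *overloaded,
                         int ncpus, int this_cpu, int this_prio, int src_prio)
{
    int next_cpu = -1;
    int next_prio = MODEL_MAX_PRIO + 1;

    /* Mirrors the "paranoid check" in spirit: this_cpu must be a valid CPU. */
    assert(this_cpu >= 0 && this_cpu < ncpus);

    for (int cpu = 0; cpu < ncpus; cpu++) {
        if (!overloaded[cpu] || cpu == this_cpu)
            continue;

        int prio = next_prio_of[cpu];

        /* Keep only a candidate of strictly higher priority than the best so far. */
        if (prio >= next_prio)
            continue;

        /* The candidate must be able to push to the source rq. */
        if (prio >= src_prio)
            continue;

        /* Ignore rqs of higher priority than the one that brought us here. */
        if (prio < this_prio)
            continue;

        /* On equal priority, only consider higher-numbered CPUs (no ping pong). */
        if (prio == this_prio && cpu < this_cpu)
            continue;

        next_prio = prio;
        next_cpu = cpu;
    }

    return next_cpu;
}
```

With four overloaded CPUs whose next priorities are {10, 10, 20, 10}, this_cpu = 1, this_prio = 10 and src_prio = 50, the model skips CPU 0 (same priority, lower CPU number), first takes CPU 2, and then replaces it with CPU 3, whose priority is higher.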
