Message-ID: <20110223165043.GA2529@nowhere>
Date: Wed, 23 Feb 2011 17:50:46 +0100
From: Frederic Weisbecker <fweisbec@...il.com>
To: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Cc: linux-kernel@...r.kernel.org, mingo@...e.hu, laijs@...fujitsu.com,
dipankar@...ibm.com, akpm@...ux-foundation.org,
mathieu.desnoyers@...ymtl.ca, josh@...htriplett.org,
niv@...ibm.com, tglx@...utronix.de, peterz@...radead.org,
rostedt@...dmis.org, Valdis.Kletnieks@...edu, dhowells@...hat.com,
eric.dumazet@...il.com, darren@...art.com,
"Paul E. McKenney" <paul.mckenney@...aro.org>
Subject: Re: [PATCH RFC tip/core/rcu 11/11] rcu: move TREE_RCU from softirq
to kthread
On Tue, Feb 22, 2011 at 05:39:40PM -0800, Paul E. McKenney wrote:
> +}
> +
> +/*
> + * Drop to non-real-time priority and yield, but only after posting a
> + * timer that will cause us to regain our real-time priority if we
> + * remain preempted. Either way, we restore our real-time priority
> + * before returning.
> + */
> +static void rcu_yield(int cpu)
> +{
> +        struct rcu_data *rdp = per_cpu_ptr(rcu_sched_state.rda, cpu);
> +        struct sched_param sp;
> +        struct timer_list yield_timer;
> +
> +        setup_timer(&yield_timer, rcu_cpu_kthread_timer, (unsigned long)rdp);
> +        mod_timer(&yield_timer, jiffies + 2);
> +        sp.sched_priority = 0;
> +        sched_setscheduler_nocheck(current, SCHED_NORMAL, &sp);
> +        schedule();
> +        sp.sched_priority = RCU_KTHREAD_PRIO;
> +        sched_setscheduler_nocheck(current, SCHED_FIFO, &sp);
> +        del_timer(&yield_timer);
> +}
> +
> +/*
> + * Handle cases where the rcu_cpu_kthread() ends up on the wrong CPU.
> + * This can happen while the corresponding CPU is either coming online
> + * or going offline. We cannot wait until the CPU is fully online
> + * before starting the kthread, because the various notifier functions
> + * can wait for RCU grace periods. So we park rcu_cpu_kthread() until
> + * the corresponding CPU is online.
> + *
> + * Return 1 if the kthread needs to stop, 0 otherwise.
> + *
> + * Caller must disable bh. This function can momentarily enable it.
> + */
> +static int rcu_cpu_kthread_should_stop(int cpu)
> +{
> +        while (cpu_is_offline(cpu) || smp_processor_id() != cpu) {
> +                if (kthread_should_stop())
> +                        return 1;
> +                local_bh_enable();
> +                schedule_timeout_uninterruptible(1);
Why is it uninterruptible? Not that it changes much anyway: an
uninterruptible sleep can be a problem for kernel threads that sleep for
a long time, because of the hung task detector, but certainly not for
one jiffy.
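
For reference, the interruptible flavour of that short wait would just be
(a sketch, nothing the patch actually needs):

        local_bh_enable();
        /* TASK_INTERRUPTIBLE sleep, which the hung task detector ignores */
        schedule_timeout_interruptible(1);
        local_bh_disable();
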
> +                if (smp_processor_id() != cpu)
> +                        set_cpus_allowed_ptr(current, cpumask_of(cpu));
> +                local_bh_disable();
> +        }
> +        return 0;
> +}
> +
> +/*
> + * Per-CPU kernel thread that invokes RCU callbacks. This replaces the
> + * earlier RCU softirq.
> + */
> +static int rcu_cpu_kthread(void *arg)
> +{
> +        int cpu = (int)(long)arg;
> +        unsigned long flags;
> +        int spincnt = 0;
> +        wait_queue_head_t *wqp = &per_cpu(rcu_cpu_wq, cpu);
> +        char work;
> +        char *workp = &per_cpu(rcu_cpu_has_work, cpu);
> +
> +        for (;;) {
> +                wait_event_interruptible(*wqp,
> +                                         *workp != 0 || kthread_should_stop());
> +                local_bh_disable();
> +                if (rcu_cpu_kthread_should_stop(cpu)) {
> +                        local_bh_enable();
> +                        break;
> +                }
> +                local_irq_save(flags);
> +                work = *workp;
> +                *workp = 0;
> +                local_irq_restore(flags);
> +                if (work)
> +                        rcu_process_callbacks();
> +                local_bh_enable();
> +                if (*workp != 0)
> +                        spincnt++;
> +                else
> +                        spincnt = 0;
> +                if (spincnt > 10) {
> +                        rcu_yield(cpu);
> +                        spincnt = 0;
> +                }
> +        }
> +        return 0;
> +}
> +
> +/*
> + * Per-rcu_node kthread, which is in charge of waking up the per-CPU
> + * kthreads when needed.
> + */
> +static int rcu_node_kthread(void *arg)
> +{
> +        int cpu;
> +        unsigned long flags;
> +        unsigned long mask;
> +        struct rcu_node *rnp = (struct rcu_node *)arg;
> +        struct sched_param sp;
> +        struct task_struct *t;
> +
> +        for (;;) {
> +                wait_event_interruptible(rnp->node_wq, rnp->wakemask != 0 ||
> +                                         kthread_should_stop());
> +                if (kthread_should_stop())
> +                        break;
> +                raw_spin_lock_irqsave(&rnp->lock, flags);
> +                mask = rnp->wakemask;
> +                rnp->wakemask = 0;
> +                raw_spin_unlock_irqrestore(&rnp->lock, flags);
> +                for (cpu = rnp->grplo; cpu <= rnp->grphi; cpu++, mask >>= 1) {
> +                        if ((mask & 0x1) == 0)
> +                                continue;
> +                        preempt_disable();
> +                        per_cpu(rcu_cpu_has_work, cpu) = 1;
> +                        t = per_cpu(rcu_cpu_kthread_task, cpu);
> +                        if (t == NULL) {
> +                                preempt_enable();
> +                                continue;
> +                        }
> +                        sp.sched_priority = RCU_KTHREAD_PRIO;
> +                        sched_setscheduler_nocheck(t, SCHED_FIFO, &sp);
> +                        wake_up_process(t);
My (mis?)understanding of the picture is that this node kthread is there
to wake up the per-CPU threads that called rcu_yield(). But rcu_yield()
doesn't actually put the CPU thread to sleep; it merely drops it to
SCHED_NORMAL so that callback processing doesn't starve the rest of the
system.

So I wonder whether this wake_up_process() is actually relevant:
sched_setscheduler_nocheck() already handles the migration between the
per-policy runqueues, and the process is not sleeping.
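
IOW, with the patch's own names (just a sketch to illustrate what I mean):

        sp.sched_priority = RCU_KTHREAD_PRIO;
        /* This alone moves a runnable t over to the SCHED_FIFO runqueue... */
        sched_setscheduler_nocheck(t, SCHED_FIFO, &sp);
        /*
         * ...so wake_up_process() returns 0 and does nothing while t is
         * still busy after rcu_yield(); it only matters once t has gone
         * back to sleep in wait_event_interruptible().
         */
        wake_up_process(t);
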
That said, by that time the process may have gone to sleep: if no other
SCHED_NORMAL task was around, it simply kept running and may already have
flushed every callback. In that case wake_up_process() does wake the
task, but it will go back to sleep right away because of the condition in
the CPU thread's wait_event_interruptible().

Right?
> +                        preempt_enable();
> +                }
> +        }
> +        return 0;
> +}