Message-ID: <20140623164321.GA5543@redhat.com>
Date: Mon, 23 Jun 2014 18:43:21 +0200
From: Oleg Nesterov <oleg@...hat.com>
To: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Cc: linux-kernel@...r.kernel.org, mingo@...nel.org,
laijs@...fujitsu.com, dipankar@...ibm.com,
akpm@...ux-foundation.org, mathieu.desnoyers@...icios.com,
josh@...htriplett.org, niv@...ibm.com, tglx@...utronix.de,
peterz@...radead.org, rostedt@...dmis.org, dhowells@...hat.com,
edumazet@...gle.com, dvhart@...ux.intel.com, fweisbec@...il.com,
sbw@....edu, Andi Kleen <ak@...ux.intel.com>,
Christoph Lameter <cl@...two.org>,
Mike Galbraith <umgwanakikbuti@...il.com>,
Eric Dumazet <eric.dumazet@...il.com>
Subject: Re: [PATCH RFC tip/core/rcu 1/5] rcu: Reduce overhead of
cond_resched() checks for RCU
On 06/20, Paul E. McKenney wrote:
>
> This commit takes a different approach to fixing this bug, mainly by
> avoiding having cond_resched() do an RCU-visible quiescent state unless
> there is a grace period that has been in flight for a significant period
> of time. This commit also reduces the common-case cond_resched() overhead
> to a check of a single per-CPU variable.
I can't say I fully understand this change, but I think it is fine.
Just a really stupid question below.
> +void rcu_resched(void)
> +{
> +	unsigned long flags;
> +	struct rcu_data *rdp;
> +	struct rcu_dynticks *rdtp;
> +	int resched_mask;
> +	struct rcu_state *rsp;
> +
> +	local_irq_save(flags);
> +
> +	/*
> +	 * Yes, we can lose flag-setting operations. This is OK, because
> +	 * the flag will be set again after some delay.
> +	 */
> +	resched_mask = raw_cpu_read(rcu_cond_resched_mask);
> +	raw_cpu_write(rcu_cond_resched_mask, 0);
> +
> +	/* Find the flavor that needs a quiescent state. */
> +	for_each_rcu_flavor(rsp) {
> +		rdp = raw_cpu_ptr(rsp->rda);
> +		if (!(resched_mask & rsp->flavor_mask))
> +			continue;
> +		smp_mb(); /* ->flavor_mask before ->cond_resched_completed. */
> +		if (ACCESS_ONCE(rdp->mynode->completed) !=
> +		    ACCESS_ONCE(rdp->cond_resched_completed))
> +			continue;
Probably the comment above mb() meant "rcu_cond_resched_mask before
->cond_resched_completed"? Otherwise I can't see why we need any
barrier.
> @@ -893,13 +946,20 @@ static int rcu_implicit_dynticks_qs(struct rcu_data *rdp,
> 	}
>
> 	/*
> -	 * There is a possibility that a CPU in adaptive-ticks state
> -	 * might run in the kernel with the scheduling-clock tick disabled
> -	 * for an extended time period. Invoke rcu_kick_nohz_cpu() to
> -	 * force the CPU to restart the scheduling-clock tick in this
> -	 * CPU is in this state.
> +	 * A CPU running for an extended time within the kernel can
> +	 * delay RCU grace periods. When the CPU is in NO_HZ_FULL mode,
> +	 * even context-switching back and forth between a pair of
> +	 * in-kernel CPU-bound tasks cannot advance grace periods.
> +	 * So if the grace period is old enough, make the CPU pay attention.
> 	 */
> -	rcu_kick_nohz_cpu(rdp->cpu);
> +	if (ULONG_CMP_GE(jiffies, rdp->rsp->gp_start + 7)) {
> +		rcrmp = &per_cpu(rcu_cond_resched_mask, rdp->cpu);
> +		ACCESS_ONCE(rdp->cond_resched_completed) =
> +			ACCESS_ONCE(rdp->mynode->completed);
> +		smp_mb(); /* ->cond_resched_completed before *rcrmp. */
> +		ACCESS_ONCE(*rcrmp) =
> +			ACCESS_ONCE(*rcrmp) + rdp->rsp->flavor_mask;
> +	}
OK, in this case I guess we need a full barrier because we need to read
rcu_cond_resched_mask before updating it...
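
For readers following the ordering argument, the two smp_mb() calls quoted above can be read as the two halves of a publish/consume pattern: the writer stores the ->completed snapshot, then sets the flag; the reader sees the flag, then reads the snapshot. The following is only a user-space sketch of that shape using C11 atomics. The names publish(), consume(), completed_snapshot and resched_mask are hypothetical stand-ins, not kernel symbols, and atomic_thread_fence(memory_order_seq_cst) is used here as an assumed analogue of smp_mb().

```c
#include <stdatomic.h>

/* Hypothetical stand-ins for rdp->cond_resched_completed and the per-CPU mask. */
static _Atomic unsigned long completed_snapshot;
static _Atomic int resched_mask;

/* Writer side: record the snapshot first, then set the flag bit. */
void publish(unsigned long completed, int flavor_mask)
{
	int old;

	atomic_store_explicit(&completed_snapshot, completed,
			      memory_order_relaxed);
	/* Full fence: snapshot store is ordered before the flag accesses. */
	atomic_thread_fence(memory_order_seq_cst);
	old = atomic_load_explicit(&resched_mask, memory_order_relaxed);
	atomic_store_explicit(&resched_mask, old | flavor_mask,
			      memory_order_relaxed);
}

/*
 * Reader side: fetch and clear the flags, then read the snapshot.
 * Returns 1 and fills *completed_out if the flavor's bit was set.
 * As in the quoted patch, the load and the clearing store are separate
 * accesses, so a concurrent flag update can be lost.
 */
int consume(int flavor_mask, unsigned long *completed_out)
{
	int mask = atomic_load_explicit(&resched_mask, memory_order_relaxed);

	atomic_store_explicit(&resched_mask, 0, memory_order_relaxed);
	if (!(mask & flavor_mask))
		return 0;
	/* Full fence: flag load is ordered before the snapshot load. */
	atomic_thread_fence(memory_order_seq_cst);
	*completed_out = atomic_load_explicit(&completed_snapshot,
					      memory_order_relaxed);
	return 1;
}
```

In this sketch a reader that observes the flag bit is guaranteed to observe a snapshot at least as new as the one stored before the writer's fence, which matches the reading "rcu_cond_resched_mask before ->cond_resched_completed" on the reader side.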
But, I am just curious, is there any reason to use ACCESS_ONCE() twice?

	ACCESS_ONCE(*rcrmp) |= rdp->rsp->flavor_mask;

or even

	ACCESS_ONCE(per_cpu(rcu_cond_resched_mask, rdp->cpu)) |=
		rdp->rsp->flavor_mask;

should work equally well, or can ACCESS_ONCE() not be used for RMW?

(And in fact at least the 2nd ACCESS_ONCE() (the load) looks unnecessary anyway
because of the smp_mb() above.)
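
To make the comparison concrete, here is a small user-space sketch of the two forms. The ACCESS_ONCE() definition is the volatile-cast macro from include/linux/compiler.h of that era; the function names set_flag_add() and set_flag_or() and the plain int mask are hypothetical and only serve the illustration.

```c
#include <assert.h>

/* Volatile-cast definition, as in include/linux/compiler.h at the time. */
#define ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

/* Form used in the patch: explicit volatile load, '+', volatile store. */
int set_flag_add(int *maskp, int flavor_mask)
{
	ACCESS_ONCE(*maskp) = ACCESS_ONCE(*maskp) + flavor_mask;
	return *maskp;
}

/*
 * Suggested form: compound assignment on the volatile lvalue.
 * This is still one load followed by one store, not an atomic RMW.
 */
int set_flag_or(int *maskp, int flavor_mask)
{
	ACCESS_ONCE(*maskp) |= flavor_mask;
	return *maskp;
}
```

Both forms compile, and both give the same result when the bit is clear beforehand. They differ only if the bit is already set: '+' then carries into the next bit, while '|=' leaves the mask unchanged. Whether that case can arise depends on the rest of the patch and is not settled by this sketch.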

Once again, of course I am not arguing if there is no "real" reason and you
just prefer it this way. But the kernel has more and more ACCESS_ONCE() users
and sometimes I simply do not understand why it is needed. For example,
cyc2ns_write_end().
Or even INIT_LIST_HEAD_RCU(). The comment in list_splice_init_rcu() says:
/*
* "first" and "last" tracking list, so initialize it. RCU readers
* have access to this list, so we must use INIT_LIST_HEAD_RCU()
* instead of INIT_LIST_HEAD().
*/
INIT_LIST_HEAD_RCU(list);
But we are going to call synchronize_rcu() or something similar; shouldn't
that act as a compiler barrier too?
Oleg.