[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20140424070726.GY26782@laptop.programming.kicks-ass.net>
Date: Thu, 24 Apr 2014 09:07:26 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Denys Vlasenko <dvlasenk@...hat.com>
Cc: linux-kernel@...r.kernel.org,
Frederic Weisbecker <fweisbec@...il.com>,
Hidetoshi Seto <seto.hidetoshi@...fujitsu.com>,
Fernando Luis Vazquez Cao <fernando_b1@....ntt.co.jp>,
Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...nel.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Arjan van de Ven <arjan@...ux.intel.com>,
Oleg Nesterov <oleg@...hat.com>
Subject: Re: [PATCH 3/3] nohz: Fix iowait overcounting if iowait task migrates
On Wed, Apr 23, 2014 at 09:00:36PM +0200, Denys Vlasenko wrote:
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 9cae286..08dd220 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -4255,6 +4255,9 @@ EXPORT_SYMBOL_GPL(yield_to);
> */
> void __sched io_schedule(void)
> {
> +#ifdef CONFIG_NO_HZ_COMMON
> + int cpu_on_entry = smp_processor_id();
> +#endif
> struct rq *rq = raw_rq();
>
> delayacct_blkio_start();
> @@ -4263,13 +4266,23 @@ void __sched io_schedule(void)
> current->in_iowait = 1;
> schedule();
> current->in_iowait = 0;
> +#ifdef CONFIG_NO_HZ_COMMON
> + if (atomic_dec_and_test(&rq->nr_iowait)) {
> + if (smp_processor_id() != cpu_on_entry)
> + tick_nohz_iowait_to_idle(cpu_on_entry);
> + }
> +#else
> atomic_dec(&rq->nr_iowait);
> +#endif
> delayacct_blkio_end();
> }
> EXPORT_SYMBOL(io_schedule);
>
> long __sched io_schedule_timeout(long timeout)
> {
> +#ifdef CONFIG_NO_HZ_COMMON
> + int cpu_on_entry = smp_processor_id();
> +#endif
> struct rq *rq = raw_rq();
> long ret;
>
> @@ -4279,7 +4292,14 @@ long __sched io_schedule_timeout(long timeout)
> current->in_iowait = 1;
> ret = schedule_timeout(timeout);
> current->in_iowait = 0;
> +#ifdef CONFIG_NO_HZ_COMMON
> + if (atomic_dec_and_test(&rq->nr_iowait)) {
> + if (smp_processor_id() != cpu_on_entry)
> + tick_nohz_iowait_to_idle(cpu_on_entry);
> + }
> +#else
> atomic_dec(&rq->nr_iowait);
> +#endif
> delayacct_blkio_end();
> return ret;
> }
Why do you insist on writing the same (buggy, see later) code twice?
Really, get lazy already and write it once but use it twice!
Its buggy because the smp_processor_id() is used in preemptible context,
its further buggy because the raw_rq() does it again and could get a rq
on a different cpu.
What you want is something like:
static inline void io_wait_start(struct rq *rq)
{
atomic_inc(&rq->nr_iowait);
current->in_iowait = 1;
}
static inline void io_wait_end(struct rq *rq)
{
current->in_iowait = 0;
#ifdef CONFIG_NO_HZ_COMMON
if (atomic_dec_and_test(&rq->nr_iowait) &&
cpu_of(rq) != raw_smp_processor_id()) {
tick_nohz_iowait_end(cpu_of(rq));
}
#else
atomic_dec(&rq->nr_iowait);
#endif
}
Anyway, I suspect that's still broken and you really need that lock
around the state like I did earlier, because the above isn't serialized
between remote wakeup and the cpu waking from nohz.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists