[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20160921155826.GB8408@pathway.suse.cz>
Date: Wed, 21 Sep 2016 17:58:27 +0200
From: Petr Mladek <pmladek@...e.com>
To: Matt Fleming <matt@...eblueprint.co.uk>
Cc: Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...nel.org>,
Byungchul Park <byungchul.park@....com>,
Frederic Weisbecker <fweisbec@...il.com>,
Luca Abeni <luca.abeni@...tn.it>,
Rik van Riel <riel@...hat.com>,
Thomas Gleixner <tglx@...utronix.de>,
Wanpeng Li <wanpeng.li@...mail.com>,
Yuyang Du <yuyang.du@...el.com>, Jan Kara <jack@...e.cz>,
Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>,
linux-kernel@...r.kernel.org,
Mel Gorman <mgorman@...hsingularity.net>,
Mike Galbraith <umgwanakikbuti@...il.com>
Subject: Re: [PATCH v2 7/7] sched/core: Add debug code to catch missing
update_rq_clock()
On Wed 2016-09-21 14:38:13, Matt Fleming wrote:
> There's no diagnostic checks for figuring out when we've accidentally
> missed update_rq_clock() calls. Let's add some by piggybacking on the
> rq_*pin_lock() wrappers.
>
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index bf48e7975c23..91f4b3d58d56 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> +/*
> + * rq::clock_update_flags bits
> + *
> + * %RQCF_REQ_SKIP - will request skipping of clock update on the next
> + * call to __schedule(). This is an optimisation to avoid
> + * neighbouring rq clock updates.
> + *
> + * %RQCF_ACT_SKIP - is set from inside of __schedule() when skipping is
> + * in effect and calls to update_rq_clock() are being ignored.
> + *
> + * %RQCF_UPDATED - is a debug flag that indicates whether a call has been
> + * made to update_rq_clock() since the last time rq::lock was pinned.
> + *
> + * If inside of __schedule(), clock_update_flags will have been
> + * shifted left (a left shift is a cheap operation for the fast path
> + * to promote %RQCF_REQ_SKIP to %RQCF_ACT_SKIP), so you must use,
> + *
> + * if (rq-clock_update_flags >= RQCF_UPDATED)
> + *
> + * to check if %RQCF_UPADTED is set. It'll never be shifted more than
> + * one position though, because the next rq_unpin_lock() will shift it
> + * back.
> + */
> +#define RQCF_REQ_SKIP 0x01
> +#define RQCF_ACT_SKIP 0x02
> +#define RQCF_UPDATED 0x04
> +
> +static inline void assert_clock_updated(struct rq *rq)
> +{
> +#ifdef CONFIG_SCHED_DEBUG
> + /*
> + * The only reason for not seeing a clock update since the
> + * last rq_pin_lock() is if we're currently skipping updates.
> + */
> + WARN_ON_ONCE(rq->clock_update_flags < RQCF_ACT_SKIP);
> +#endif
> +}
I am afraid that it might eventually create a deadlock.
For example, there is the following call chain:
+ printk()
+ vprintk_func -> vprintk_default()
+ vprinkt_emit()
+ console_unlock()
+ up_console_sem()
+ up() # takes &sem->lock
+ __up()
+ wake_up_process()
+ try_to_wake_up()
+ ttwu_queue()
+ ttwu_do_activate()
+ ttwu_do_wakeup()
+ rq_clock()
+ lockdep_assert_held()
+ WARN_ON_ONCE()
+ printk()
+ vprintk_func -> vprintk_default()
+ vprintk_emit()
+ console_try_lock()
+ down_trylock_console_sem()
+ __down_trylock_console_sem()
+ down_trylock()
DEADLOCK: Unable to take &sem->lock
We have recently discussed similar deadlock, see the thread
around https://lkml.kernel.org/r/20160714221251.GE3057@ubuntu
A temporary solution would be to replace the WARN_ON_ONCE()
by printk_deferred(). Of course, this is far from ideal because
you do not get the stack, ...
Sergey is working on WARN_ON_ONCE_DEFERRED() but it is not
an easy task.
> static inline u64 rq_clock(struct rq *rq)
> {
> lockdep_assert_held(&rq->lock);
> + assert_clock_updated(rq);
> +
> return rq->clock;
> }
>
I am not sure how the above call chain is realistic. But adding
WARN_ON() into the scheduler paths is risky in general.
Best Regards,
Petr
Powered by blists - more mailing lists