Message-Id: <20150528130944.9dde0f591a18d656f2a7c519@linux-foundation.org>
Date: Thu, 28 May 2015 13:09:44 -0700
From: Andrew Morton <akpm@...ux-foundation.org>
To: Petr Mladek <pmladek@...e.cz>
Cc: Frederic Weisbecker <fweisbec@...il.com>,
Steven Rostedt <rostedt@...dmis.org>,
Dave Anderson <anderson@...hat.com>,
"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
Kay Sievers <kay@...y.org>, Jiri Kosina <jkosina@...e.cz>,
Michal Hocko <mhocko@...e.cz>, Jan Kara <jack@...e.cz>,
linux-kernel@...r.kernel.org, Wang Long <long.wanglong@...wei.com>,
peifeiyue@...wei.com, dzickus@...hat.com, morgan.wang@...wei.com,
sasha.levin@...cle.com
Subject: Re: [PATCH 02/10] printk: Try harder to get logbuf_lock on NMI
On Thu, 28 May 2015 15:50:54 +0200 Petr Mladek <pmladek@...e.cz> wrote:
> > > +{
> > > +	u64 start_time, current_time;
> > > +	int this_cpu = smp_processor_id();
> > > +
> > > +	/* no way if we are already locked on this CPU */
> > > +	if (logbuf_cpu == this_cpu)
> > > +		return 0;
> > > +
> > > +	/* try hard to get the lock but do not wait forever */
> > > +	start_time = cpu_clock(this_cpu);
> > > +	current_time = start_time;
> > > +	while (current_time - start_time < TRY_LOCKBUF_LOCK_MAX_DELAY_NS) {
> > > +		if (raw_spin_trylock(&logbuf_lock))
> > > +			return 1;
> > > +		cpu_relax();
> > > +		current_time = cpu_clock(this_cpu);
> > > +	}
> >
> > (Looks at the read_seqcount_retry() in
> > kernel/time/sched_clock.c:sched_clock())
> >
> > Running cpu_clock() in NMI context seems a generally bad idea.
>
> I am sorry but this is too cryptic for me :-)
> read_seqcount_retry() looks safe to me under NMI.

hmpf.  If you guys say so...

Note that it's not just a matter of "safe to call from NMI context".
The above loop also assumes that cpu_clock() is *being updated* within
the context of a single NMI.  Is that true/safe now and in the future?
Probably.  I didn't check all architectures but ARM looks OK at present.

We should at least update Documentation/timers/timekeeping.txt: "a sane
value" becomes "the correct value", no alternatives.
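
To make the concern concrete, here is a rough userspace model (not kernel
code) of the same bounded-trylock loop.  It only illustrates that the
time-based exit depends on the clock advancing while we spin.  Everything
in it is an assumption made for illustration: try_lock_bounded(),
MAX_SPINS and the two fake clocks are hypothetical names, a C11
atomic_flag stands in for raw_spin_trylock(), and the iteration cap is one
possible fallback, not something the patch does.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_DELAY_NS	100000ULL	/* hypothetical 100us budget */
#define MAX_SPINS	1000000UL	/* hypothetical cap for a stalled clock */

typedef uint64_t (*clock_fn)(void);

/* Stand-in for logbuf_lock; set means "held". */
static atomic_flag model_lock = ATOMIC_FLAG_INIT;

/* Fake clock that never advances, like a clock not updated inside an NMI. */
uint64_t frozen_clock(void)
{
	return 42;
}

/* Fake clock that advances by 1us on every read. */
uint64_t ticking_clock(void)
{
	static uint64_t now_ns;

	now_ns += 1000;
	return now_ns;
}

/*
 * Try to take model_lock until either the time budget or the spin cap
 * runs out.  Returns true if the lock was taken.  *spins receives the
 * number of failed attempts.
 */
bool try_lock_bounded(clock_fn now, unsigned long *spins)
{
	uint64_t start = now();
	unsigned long n = 0;
	bool got = false;

	while (now() - start < MAX_DELAY_NS && n < MAX_SPINS) {
		if (!atomic_flag_test_and_set_explicit(&model_lock,
						       memory_order_acquire)) {
			got = true;
			break;
		}
		n++;
	}
	*spins = n;
	return got;
}
```

With the lock held by someone else, ticking_clock ends the loop via the
time budget after a handful of attempts, while frozen_clock ends it only
because of the spin cap.  Without the cap the frozen-clock case would
never terminate.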
> > There are many sites in kernel/printk/printk.c which take logbuf_lock,
> > but this patch only sets logbuf_cpu in one of those cases:
> > vprintk_emit(). I suggest adding helper functions to take/release
> > logbuf_lock. And rename logbuf_lock to something else to ensure that
> > nobody accidentally takes the lock directly.
>
> IMHO, vprintk_emit() is special. It is the only location where the
> lock is taken in NMI context. The other functions are used to dump
> @logbuf and are called in normal context.
>
> try_logbuf_lock_in_nmi() could fail and we need to handle the error
> path. We do not need to do this in the other locations.
>
> Note that we do not want to get the console in NMI because
> there are even more locks that might cause a deadlock.
Consider the case where a CPU has taken logbuf_lock within
devkmsg_read() and then receives an NMI, from which it calls
try_logbuf_lock_in_nmi():
> +/* We must be careful in NMI when we managed to preempt a running printk */
> +static int try_logbuf_lock_in_nmi(void)
> +{
> +	u64 start_time, current_time;
> +	int this_cpu = smp_processor_id();
> +
> +	/* no way if we are already locked on this CPU */
> +	if (logbuf_cpu == this_cpu)
> +		return 0;
> +
> +	/* try hard to get the lock but do not wait forever */
> +	start_time = cpu_clock(this_cpu);
> +	current_time = start_time;
> +	while (current_time - start_time < TRY_LOCKBUF_LOCK_MAX_DELAY_NS) {
> +		if (raw_spin_trylock(&logbuf_lock))
> +			return 1;
> +		cpu_relax();
> +		current_time = cpu_clock(this_cpu);
> +	}
> +
> +	return 0;
> +}
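
devkmsg_read() doesn't set logbuf_cpu, so the "already locked on this
CPU" check doesn't fire, and the lock holder can't run again until the
NMI returns.  That CPU is now going to spin around for the full 100us
and then time out.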
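
A rough userspace model (not kernel code) of the take/release helpers
suggested earlier in the thread, to show how recording the owner in every
lock path lets the NMI path bail out at once in the scenario above.  All
names here (model_buf_lock() and friends, NO_OWNER) are hypothetical, a
C11 atomic_flag stands in for the raw spinlock, the "cpu" argument stands
in for smp_processor_id(), and a spin count replaces the cpu_clock()
timeout to keep the sketch deterministic.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NO_OWNER	(-1)

static atomic_flag buf_lock = ATOMIC_FLAG_INIT;	/* stand-in for logbuf_lock */
static atomic_int buf_owner = NO_OWNER;		/* stand-in for logbuf_cpu */

/* Every normal-context user takes the lock through this helper. */
void model_buf_lock(int cpu)
{
	while (atomic_flag_test_and_set_explicit(&buf_lock,
						 memory_order_acquire))
		;
	/*
	 * An "NMI" arriving between the acquire above and this store
	 * still sees NO_OWNER and falls back to the bounded spin.
	 */
	atomic_store(&buf_owner, cpu);
}

void model_buf_unlock(int cpu)
{
	assert(atomic_load(&buf_owner) == cpu);
	atomic_store(&buf_owner, NO_OWNER);
	atomic_flag_clear_explicit(&buf_lock, memory_order_release);
}

/*
 * NMI-style attempt: give up immediately if this CPU already owns the
 * lock, otherwise try at most max_spins times.  *spins receives the
 * number of failed attempts.
 */
bool model_buf_trylock_nmi(int cpu, unsigned long max_spins,
			   unsigned long *spins)
{
	unsigned long n;

	if (atomic_load(&buf_owner) == cpu) {
		*spins = 0;
		return false;
	}

	for (n = 0; n < max_spins; n++) {
		if (!atomic_flag_test_and_set_explicit(&buf_lock,
						       memory_order_acquire)) {
			atomic_store(&buf_owner, cpu);
			*spins = n;
			return true;
		}
	}
	*spins = n;
	return false;
}
```

If "CPU 0" holds the lock via model_buf_lock(0), a call to
model_buf_trylock_nmi(0, ...) fails without spinning, whereas a call for
any other CPU spins up to max_spins times before giving up.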