Message-ID: <20130218163135.GE12679@quack.suse.cz>
Date:	Mon, 18 Feb 2013 17:31:35 +0100
From:	Jan Kara <jack@...e.cz>
To:	Andrew Morton <akpm@...ux-foundation.org>
Cc:	Jan Kara <jack@...e.cz>, Steven Rostedt <rostedt@...dmis.org>,
	LKML <linux-kernel@...r.kernel.org>,
	Frederic Weisbecker <fweisbec@...il.com>, jslaby@...e.cz,
	Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
	Ingo Molnar <mingo@...nel.org>,
	Peter Zijlstra <peterz@...radead.org>,
	"kay.sievers" <kay.sievers@...y.org>
Subject: Re: [PATCH 3/3] printk: Avoid softlockups in console_unlock()

On Fri 15-02-13 14:22:19, Andrew Morton wrote:
> On Fri, 15 Feb 2013 17:57:10 +0100
> Jan Kara <jack@...e.cz> wrote:
> 
> > A CPU can be caught in console_unlock() for a long time (tens of seconds are
> > reported by our customers) when other CPUs are using printk heavily and serial
> > console makes printing slow. Although serial console drivers call
> > touch_nmi_watchdog(), this triggers softlockup warnings because
> > interrupts are disabled for the whole time console_unlock() runs (e.g.
> > vprintk() calls console_unlock() with interrupts disabled). Thus IPIs
> > cannot be processed and other CPUs get stuck spinning in calls like
> > smp_call_function_many(). Also RCU eventually starts reporting lockups.
> > 
> > In my artificial testing I also managed to trigger a situation when disk
> > disappeared from the system apparently because commands to / from it
> > could not be delivered for long enough. This is why just silencing
> > watchdogs isn't a reliable solution to the problem and we simply have to
> > avoid spending too long in console_unlock().
> > 
> > We fix the issue by limiting the time we spend in console_unlock() to
> > watchdog_thresh() / 4 (unless we are in an early boot stage or oops is
> > happening). The rest of the buffer will be printed either by further
> > callers to printk() or during next timer tick.
> > 
> 
> It still gives me tummy ache :(
  But it's better than it used to be, isn't it? At least I like this
version more than the one that postponed printing to a worker thread, since
here we only depend on timer ticks occurring...
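
  To illustrate the idea from the changelog, here is a minimal userspace
sketch, not the patch itself. It assumes a simulated clock and a made-up
struct sim_console; drain_with_budget() and all field names are hypothetical.
The budget argument stands in for the quarter of the watchdog threshold
mentioned above. The function stops printing once the budget is used up and
reports that records remain, so that a later caller or the next tick can
continue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the console state; not from the patch. */
struct sim_console {
	size_t pending;      /* records still waiting in the log buffer */
	unsigned long now;   /* simulated clock, in ticks */
	unsigned long cost;  /* ticks one record takes on the slow console */
};

/*
 * Print records until the buffer is empty or the time budget is used up.
 * Returns true if records remain and have to be flushed later, e.g. by a
 * later printk() caller or on the next timer tick.
 */
static bool drain_with_budget(struct sim_console *con, unsigned long budget)
{
	unsigned long start;

	assert(con != NULL);
	start = con->now;

	while (con->pending > 0) {
		/* Check the elapsed time before each record, not after. */
		if (con->now - start >= budget)
			return true;
		con->pending--;          /* "print" one record */
		con->now += con->cost;   /* the slow console eats time */
	}
	return false;
}
```

With 10 pending records, a cost of 3 ticks per record and a budget of 10
ticks, each call prints at most 4 records, so the buffer is emptied on the
third call.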

> The patch adds additional tests of oops_in_progress.  Some description
> of your thinking on that matter would be appropriate?
  Good point, I'll add that. My thinking was that when we are oopsing, all
bets are off: we want to get the messages to the console as reliably as
possible, and we no longer care about softlockups since we have bigger
trouble anyway.
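
  Roughly, the decision can be sketched as the predicate below. This is
only an illustration under assumptions: should_limit_printing() and
enum sim_state are hypothetical names, and the oops flag is passed in as a
plain argument here rather than read from the kernel's global
oops_in_progress.

```c
#include <stdbool.h>

/* Hypothetical two-state model of the system; not from the patch. */
enum sim_state { SIM_BOOTING, SIM_RUNNING };

/*
 * Decide whether the time limit on console printing applies.
 * During an oops or early boot we print everything, because getting the
 * messages out matters more than avoiding softlockup warnings.
 */
static bool should_limit_printing(enum sim_state state, int oops_in_progress)
{
	if (oops_in_progress)
		return false;   /* all bets are off, flush reliably */
	if (state == SIM_BOOTING)
		return false;   /* early boot: no limit */
	return true;
}
```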

> > --- a/kernel/printk.c
> > +++ b/kernel/printk.c
> > @@ -1990,17 +1990,31 @@ int is_console_locked(void)
> >  #define PRINTK_PENDING_OUTPUT	2
> >  
> >  static unsigned long printk_pending;
> > +static int last_printing_cpu = -1;
> > +
> > +static bool __console_unlock(void);
> >  
> >  void printk_tick(void)
> 
> printk_tick() no longer exists in linux-next.
  Thanks for the notice, I'll rebase and fix this up.

								Honza
-- 
Jan Kara <jack@...e.cz>
SUSE Labs, CR
