linux-kernel - Re: + printk-print-initial-logbuf-contents-before-re-enabling-interrupts.patch added to -mm tree

Message-ID: <20140506200022.GB27469@quack.suse.cz>
Date:	Tue, 6 May 2014 22:00:22 +0200
From:	Jan Kara <jack@...e.cz>
To:	Will Deacon <will.deacon@....com>
Cc:	Jan Kara <jack@...e.cz>,
	"mm-commits@...r.kernel.org" <mm-commits@...r.kernel.org>,
	"peterz@...radead.org" <peterz@...radead.org>,
	"kay@...y.org" <kay@...y.org>, LKML <linux-kernel@...r.kernel.org>,
	Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: +
 printk-print-initial-logbuf-contents-before-re-enabling-interrupts.patch
 added to -mm tree

On Tue 06-05-14 16:00:37, Will Deacon wrote:
> On Tue, May 06, 2014 at 03:00:32PM +0100, Jan Kara wrote:
> > On Tue 06-05-14 14:12:34, Will Deacon wrote:
> > > On Tue, May 06, 2014 at 01:29:58PM +0100, Jan Kara wrote:
> > > >   Well, with a serial console the backlog can actually get pretty big. During
> > > > boot on large machines I've seen CPUs stuck in that very loop in
> > > > console_unlock() for tens of seconds. Obviously that causes problems - e.g.
> > > > watchdog fires, RCU lockup detector fires, when interrupts are disabled,
> > > > some hardware gives up because its interrupts weren't served for too long.
> > > > All in all the machine just dies.
> > > 
> > > Right, so there's the usual compromise here between throughput and latency.
> >   I'd see that compromise if enabling and disabling interrupts took a
> > considerable amount of time. I don't think that was your concern,
> > was it? Maybe I just misunderstood you...
> 
> Well, that isn't the quickest operation on ARM (since it's
> self-synchronising), but I was actually referring to the ability to drain
> the log buffer (with interrupts disabled) vs the ability to service
> interrupts quickly. The moment we re-enable interrupts, we can start adding
> more messages to the buffer from the IRQ path (I didn't attempt to solve the
> multi-CPU case, as I mentioned before).
  I see. But practically the multi-CPU case is much more common than the
IRQ case, isn't it?

> > > That said, printing one message each time seems to go too far in the
> > > opposite direction for my liking, so the best bet is likely to limit the
> > > work to some fixed number of messages. Do you have any feeling for such a
> > > limit?
> >   If you really are concerned about enabling and disabling of interrupts
> > taking significant time (and it may be, I just don't know), then printing
> > a couple of messages without enabling them makes sense. How many is a tricky
> > question since it depends on the console speed. I had a similar problem
> > when I was deciding in my patch when we should ask another CPU to take over
> > printing from the current CPU (to avoid the issues I've described in the
> > previous email). I was experimenting with various stuff but in the end I
> > resorted to a stupid "after X characters are printed".
> 
> Yeah, so you also end up with the same problem of tuning your heuristics.
> Peter's suggestion of X == 42 is as good as any arbitrary constant I can
> suggest, hence my snapshotting of log_next_seq originally.
  Yes I can fully understand where you came from :). I just wanted to point
out that your choice isn't a particularly good one either.
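
  For illustration only, here is a minimal user-space sketch of the "print a
bounded batch, then re-enable interrupts" heuristic discussed above, using a
character budget per batch. The names log_sim, drain_batch and drain_all are
hypothetical and are not kernel APIs; the real loop in console_unlock() is
more involved, and the budget value is an arbitrary tunable, not something
taken from any posted patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the kernel log buffer. */
struct log_sim {
	const char **msgs;   /* pending messages, oldest first */
	size_t next;         /* next message to print (think console_seq) */
	size_t count;        /* one past the last message (think log_next_seq) */
	size_t irq_windows;  /* times "interrupts" were re-enabled */
};

/*
 * Print messages with "interrupts disabled" until adding the next message
 * would exceed char_budget, then stop so the caller can re-enable interrupts.
 * At least one message is always printed so the loop makes progress.
 * Returns the number of messages printed in this batch.
 */
static size_t drain_batch(struct log_sim *log, size_t char_budget)
{
	size_t printed = 0, chars = 0;

	while (log->next < log->count) {
		size_t len = strlen(log->msgs[log->next]);

		if (printed > 0 && chars + len > char_budget)
			break;
		/* A real console driver would emit the message here. */
		chars += len;
		log->next++;
		printed++;
	}
	return printed;
}

/*
 * Drain the whole buffer, opening one interrupt window after each batch.
 * Returns the number of interrupt windows that were opened.
 */
static size_t drain_all(struct log_sim *log, size_t char_budget)
{
	assert(char_budget > 0);

	while (log->next < log->count) {
		drain_batch(log, char_budget);
		/* The kernel would restore and re-save the IRQ state here. */
		log->irq_windows++;
	}
	return log->irq_windows;
}
```

  A budget of one character degenerates into "one message per interrupt
window", and a very large budget degenerates into "drain everything with
interrupts off", which are the two extremes being argued about in this
thread. The sketch does not model other CPUs appending messages while the
buffer is being drained.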
 
> > > > And the backlog builds up because while one cpu is doing the printing in
> > > > console_unlock() all the other cpus are busily adding new messages to the
> > > > buffer faster than they can be printed...
> > > 
> > > Understood, but that's also the situation without this patch (and not one
> > > that I think you can fix without hurting latency).
> >   Sure. I have a patch which transitions printing to another CPU once in a
> > while so a single CPU isn't hogged for too long and that solves the issues I
> > have observed. But Alan didn't like this solution so the issue is unfixed
> > for now.
> 
> Interesting. Do you have a pointer to the thread?
  The patchset posting starts here:
https://lkml.org/lkml/2014/3/25/343

  Patch 5/8 is probably the most interesting for you (patches 1-4 are
already in the mm tree).

								Honza
-- 
Jan Kara <jack@...e.cz>
SUSE Labs, CR