Message-ID: <20161130100129.GD24060@pathway.suse.cz>
Date:   Wed, 30 Nov 2016 11:01:29 +0100
From:   Petr Mladek <pmladek@...e.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
        Josh Poimboeuf <jpoimboe@...hat.com>,
        Vince Weaver <vincent.weaver@...ne.edu>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Ingo Molnar <mingo@...hat.com>,
        Arnaldo Carvalho de Melo <acme@...nel.org>,
        "dvyukov@...gle.com" <dvyukov@...gle.com>
Subject: Re: perf: fuzzer BUG: KASAN: stack-out-of-bounds in __unwind_start

On Tue 2016-11-29 18:10:38, Peter Zijlstra wrote:
> On Tue, Nov 29, 2016 at 05:29:20PM +0100, Petr Mladek wrote:
> 
> > > > People are very busy polishing the turd we call printk, but from where
> > > > I'm sitting its terminally and unfixably broken.
> > 
> > I still hope that we could do better :-)
> 
> How? The console drivers are a complete trainwreck, you simply cannot
> build anything sensible ontop of a trainwreck.

I am afraid that I will not persuade you but...


> And from what I understood from talking to someone (I again forgot who)
> at LPC, the whole reason people were poking at this is that the block
> layer (or something thereabouts) prints a gazillion lines of crap when
> you attach a stupid amount of devices (through FC or other SAN like
> things).

This is crazy indeed if it happens on a production system.


> The way we've 'fixed' that in the scheduler (a fairly long time ago)
> when SGI complained about our printks taking too long (because they had
> 4096 CPUs), is to simply remove the printks (they're now hidden behind
> the sched_debug boot param).

This is one solution. But what if you want to enable debugging and the
system does not boot because the printing takes too long?


> In any case, as long as printk has a globally serialized 'log', it, per
> design, will be worse than the console drivers its build upon. And them
> being shit precludes the entire stack from being useful.

I probably still do not understand all the problems with console
drivers. My understanding is that the problem is that they have
their own locking and are slow. It means that they are prone to
deadlocks and they might block for a long time.

In comparison, the serialized log buffer has one lock and writing
is fast. It means that it suffers "only" from the deadlocks.
And we try to address the deadlocks by using the temporary
per-CPU buffers in critical situations (NMI, locked sections).
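
To illustrate the per-CPU fallback idea, here is a minimal userspace
sketch. It is not the kernel implementation: every name in it
(sketch_log, sketch_flush, main_log_locked, the buffer sizes) is made
up for illustration, and a plain flag stands in for the real lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NR_CPUS   4      /* assumed values, chosen only for the sketch */
#define LOG_SIZE  1024
#define PCPU_SIZE 256

/* Hypothetical stand-ins, not the kernel's data structures. */
char   main_log[LOG_SIZE];
size_t main_len;
bool   main_log_locked;              /* models the single log-buffer lock */

char   pcpu_buf[NR_CPUS][PCPU_SIZE]; /* temporary per-CPU buffers */
size_t pcpu_len[NR_CPUS];

/* Append msg to buf, truncating if needed; returns the new length. */
size_t append(char *buf, size_t len, size_t cap, const char *msg)
{
	size_t n = strlen(msg);

	if (n > cap - len - 1)
		n = cap - len - 1;
	memcpy(buf + len, msg, n);
	buf[len + n] = '\0';
	return len + n;
}

/*
 * Log a message from the given CPU. If the main log is already locked
 * (think: an NMI hit while the lock was held), taking the lock again
 * would deadlock, so the message goes to the per-CPU buffer instead.
 * Returns true if the message reached the main log directly.
 */
bool sketch_log(int cpu, const char *msg)
{
	assert(cpu >= 0 && cpu < NR_CPUS);

	if (main_log_locked) {
		pcpu_len[cpu] = append(pcpu_buf[cpu], pcpu_len[cpu],
				       PCPU_SIZE, msg);
		return false;
	}

	main_log_locked = true;
	main_len = append(main_log, main_len, LOG_SIZE, msg);
	main_log_locked = false;
	return true;
}

/* Later, from a safe context, move the stashed messages to the main log. */
void sketch_flush(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);

	if (main_log_locked || pcpu_len[cpu] == 0)
		return;

	main_log_locked = true;
	main_len = append(main_log, main_len, LOG_SIZE, pcpu_buf[cpu]);
	main_log_locked = false;

	pcpu_len[cpu] = 0;
	pcpu_buf[cpu][0] = '\0';
}
```

The point of the sketch is only the control flow: the fast path takes
the single lock, the critical path never waits for it, and the stashed
messages reach the main log later, possibly out of order with respect
to messages from other CPUs.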

Of course, it is useless if you have the messages in a buffer
and can't reach them. But we make a best effort to push them
to the consoles and the crash dump. Also, it might be very useful
to have the log buffer in persistent memory.


> It mostly works, most of the time, and that seems to be what Linus
> wants, since its really the best we can have given the constraints. But
> for debugging, when you have a UART, it totally blows.

I believe that the early console is the last resort for debugging
some types of bugs. But many other bugs can be debugged with the
classic printk(). And there are (production) systems where you
cannot (easily) or do not want to use early printk all the time.

Another question is the complexity of the printk() code. In particular,
the big effort to get "perfect" (non-mixed) output is questionable.

Best Regards,
Petr
