Message-ID: <20161130100129.GD24060@pathway.suse.cz>
Date:   Wed, 30 Nov 2016 11:01:29 +0100
From:   Petr Mladek <pmladek@...e.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
        Josh Poimboeuf <jpoimboe@...hat.com>,
        Vince Weaver <vincent.weaver@...ne.edu>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Ingo Molnar <mingo@...hat.com>,
        Arnaldo Carvalho de Melo <acme@...nel.org>,
        "dvyukov@...gle.com" <dvyukov@...gle.com>
Subject: Re: perf: fuzzer BUG: KASAN: stack-out-of-bounds in __unwind_start

On Tue 2016-11-29 18:10:38, Peter Zijlstra wrote:
> On Tue, Nov 29, 2016 at 05:29:20PM +0100, Petr Mladek wrote:
> 
> > > > People are very busy polishing the turd we call printk, but from where
> > > > I'm sitting its terminally and unfixably broken.
> > 
> > I still hope that we could do better :-)
> 
> How? The console drivers are a complete trainwreck, you simply cannot
> build anything sensible ontop of a trainwreck.

I am afraid that I will not persuade you but...

> And from what I understood from talking to someone (I again forgot who)
> at LPC, the whole reason people were poking at this is that the block
> layer (or something thereabouts) prints a gazillion lines of crap when
> you attach a stupid amount of devices (through FC or other SAN like
> things).

This is crazy indeed if it happens on a production system.

> The way we've 'fixed' that in the scheduler (a fairly long time ago)
> when SGI complained about our printks taking too long (because they had
> 4096 CPUs), is to simply remove the printks (they're now hidden behind
> the sched_debug boot param).

This is a solution. But what if you want to enable debugging and the
system does not boot because the printing takes too long?

> In any case, as long as printk has a globally serialized 'log', it, per
> design, will be worse than the console drivers its build upon. And them
> being shit precludes the entire stack from being useful.

I probably still do not understand all the problems with console
drivers. My understanding is that the problem is that they have
their own locking and are slow. It means that they are prone to
deadlocks and they might block for a long time.

In comparison, the serialized log buffer has one lock and writing
is fast. It means that it suffers "only" from deadlocks.
And we try to address the deadlocks by using temporary
per-CPU buffers in critical situations (NMI, locked sections).
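
To illustrate the idea, here is a toy userspace model in Python. It is
only a sketch and not the kernel code: the names ToyPrintk, printk()
and flush() are made up for this example, and a failed trylock stands
in for "we are in a critical context". The real code decides by
context (for example, being in NMI) rather than by trylock.

```python
import threading
from collections import deque


class ToyPrintk:
    """Toy model: one serialized log plus temporary per-CPU buffers."""

    def __init__(self, ncpus):
        self.lock = threading.Lock()      # the single log buffer lock
        self.records = []                 # the serialized log
        self.percpu = [deque() for _ in range(ncpus)]

    def _flush_locked(self, cpu):
        # Caller holds self.lock: move deferred messages into the log.
        buf = self.percpu[cpu]
        while buf:
            self.records.append(buf.popleft())

    def printk(self, cpu, msg):
        # Never wait for the lock here. Waiting is what would deadlock
        # if the lock owner was interrupted on this very CPU.
        if self.lock.acquire(blocking=False):
            try:
                self._flush_locked(cpu)   # keep per-CPU ordering
                self.records.append(msg)
            finally:
                self.lock.release()
            return True
        # Assumption of this sketch: a busy lock means "critical
        # situation", so park the message in the per-CPU buffer.
        self.percpu[cpu].append(msg)
        return False

    def flush(self, cpu):
        # Called later from a safe context, where blocking is fine.
        with self.lock:
            self._flush_locked(cpu)
```

The message is never lost in this model, but it becomes visible in
the main log only after someone calls flush() from a safe context.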

Of course, it is useless if you have the messages in a buffer
and can't reach them. But we make a best effort to push them
to the consoles and the crash dump. Also it might be very useful
to have the log buffer in persistent memory.

> It mostly works, most of the time, and that seems to be what Linus
> wants, since its really the best we can have given the constraints. But
> for debugging, when you have a UART, it totally blows.

I believe that the early console is the last resort for debugging
some types of bugs. But many other bugs can be debugged with the
classic printk(). And there are (production) systems where you
cannot (easily) or do not want to use early printk all the time.

Another question is the complexity of the printk() code. In particular,
the big effort to get "perfect" (non-mixed) output is questionable.

Best Regards,
Petr