linux-kernel - Re: [PATCH] twist: allow converting pr_devel()/pr_debug() into printk(KERN

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <f02a71bc-0867-be60-182b-10d7377b2b04@i-love.sakura.ne.jp>
Date:   Mon, 25 May 2020 19:43:04 +0900
From:   Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>
To:     Sergey Senozhatsky <sergey.senozhatsky@...il.com>,
        Petr Mladek <pmladek@...e.com>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        linux-kernel@...r.kernel.org, Dmitry Vyukov <dvyukov@...gle.com>,
        Ondrej Mosnacek <omosnace@...hat.com>,
        Steven Rostedt <rostedt@...dmis.org>
Subject: Re: [PATCH] twist: allow converting pr_devel()/pr_debug() into
 printk(KERN_DEBUG)

On 2020/05/25 17:42, Petr Mladek wrote:
> I see few drawbacks with this patch:
> 
> 1. It will cause adding much more messages into the logbuffer even
>    though they are not flushed to the console. It might cause that
>    more important messages will get overridden before they reach
>    console. They might also make hard to read the full log.

Since the user of this twist option will select console loglevel in a way
KERN_DEBUG messages are not printed to consoles, KERN_DEBUG messages will
be immediately processed (and space for future messages will be reclaimed).
Therefore, I don't think that more important messages will get overridden.

This twist option might increase possibility of mixing KERN_DEBUG messages
and non-KERN_DEBUG messages due to KERN_CONT case.

But if these concerns turn out to be a real problem, we can redirect
pr_devel()/pr_debug() to simple snprintf() which evaluates arguments
but discards the result without storing into the logbuffer.

> 
> 2. Crash inside printk() causes recursive messages. They are currently
>    printed into the printk_safe() buffers and there is a bigger risk
>    that they will not reach the console.

Currently "static char textbuf[LOG_LINE_MAX];" is "static" because it is used
under logbuf_lock. If we remove "static", we can use "char textbuf[LOG_LINE_MAX];"
without logbuf_lock. Then, we can bring potentially dangerous-and-slow vscnprintf()
in vprintk_store() to earlier stage (and vprintk_store() will need to do simple
copy) so that oops in printk() will happen before entering printk-safe context.
I think that this change follows a direction which lockless logbuf will want.

> 
> 3. pr_debug() messages are not printed by default. It is possible that
>    nobody used them for ages. You might get many errors in less
>    maintained code instead in the really used one. I mean that you
>    will get more noise with less gain.

Given that potentially dangerous-and-slow vscnprintf() is done outside of
printk-safe context, we can get more test coverage without difficult things.

> 
> 
> Have you tested this patch by the syzcaller with many runs, please?
> Did it helped to actually discover more bugs?
> Did it really made things easier?

syzbot can't test with custom patches. The only way to test this patch is
to send to e.g. linux-next.git which syzbot is testing.

> 
> I am not able to judge usefulness without more data. My intuition
> tells me that we should keep the number of syzcaller-related twists
> as small as possible. Otherwise, syscaller will diverge more and
> more from reality.

The twist options are not specific to syzkaller. Anyone can selectively
enable the twist options.

On 2020/05/25 18:11, Sergey Senozhatsky wrote:
> On (20/05/25 10:42), Petr Mladek wrote:
>> On Sun 2020-05-24 23:50:34, Tetsuo Handa wrote:
>>> syzbot found a NULL pointer dereference bug inside mptcp_recvmsg() due to
>>> ssock == NULL, but this bug manifested inside selinux_socket_recvmsg()
>>> because pr_debug() was no-op [1].
>>>
>>>   pr_debug("fallback-read subflow=%p",
>>>            mptcp_subflow_ctx(ssock->sk));
>>>   copied = sock_recvmsg(ssock, msg, flags);
>>
>> The NULL pointer deference was found even without this patch.
>> This patch would just cause that it will manifest itself on another
>> place. What is the benefit, please?

It would help localizing the bug in this specific case.

It's not only about %p, even %d can crash kernel or leak sensitive
info (if it happens after-free/out-of-bounds/uninit). Overall it
increases code coverage and allows to catch more bugs earlier. 

> 
> Right, I don't get this patch. A NULL-deref is still a NULL pointer deref.
> pr_debug() will fault reading one byte from the address and print something
> like "fallback-read subflow=(efault)" to printk-safe buffer, but then
> sock_recvmsg() is still going to do its thing.

Since this NULL pointer dereference already happens before calling pr_debug(),
we won't store "fallback-read subflow=(efault)" to printk-safe buffer.

Just evaluating pr_devel()/pr_debug() arguments would help finding some bugs.
Again, we can change this twist option to redirect pr_devel()/pr_debug() to
simple snprintf() which evaluates arguments but discards the result without
storing into the logbuffer.