linux-kernel - Re: [patch RFC 00/29] printk: A new approach

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20220911090155.GK4315@paulmck-ThinkPad-P17-Gen-1>
Date:   Sun, 11 Sep 2022 02:01:55 -0700
From:   "Paul E. McKenney" <paulmck@...nel.org>
To:     Thomas Gleixner <tglx@...utronix.de>
Cc:     LKML <linux-kernel@...r.kernel.org>,
        John Ogness <john.ogness@...utronix.de>,
        Petr Mladek <pmladek@...e.com>,
        Sergey Senozhatsky <senozhatsky@...omium.org>,
        Steven Rostedt <rostedt@...dmis.org>,
        Linus Torvalds <torvalds@...uxfoundation.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Daniel Vetter <daniel@...ll.ch>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Helge Deller <deller@....de>,
        Jason Wessel <jason.wessel@...driver.com>,
        Daniel Thompson <daniel.thompson@...aro.org>
Subject: Re: [patch RFC 00/29] printk: A new approach - WIP

On Sun, Sep 11, 2022 at 12:27:31AM +0200, Thomas Gleixner wrote:

[ . . . ]

> The infrastructure we implemented comes with the following properties:
> 
>   - It shares the console list, but only for registration/unregistration
>     purposes. Consoles which are utilizing the new infrastructure are
>     ignored by the existing mechanisms and vice versa. This allows to
>     reuse as much code as possible and preserves the printk semantics
>     except for the threaded printing part.
> 
>   - The console list walk becomes SRCU protected to avoid any restrictions
>     on contexts

I am guessing that this means that you need an NMI-safe srcu_read_lock()
and srcu_read_unlock().  If my guess is correct, please let me know,
and I will create one for you.  (As it stands, these are NMI-safe on x86,
but not on architectures using the asm-generic variant of this_cpu_inc().

The result would be srcu_read_lock_nmi() and srcu_read_unlock_nmi()
or similar.  There would need to be something to prevent mixing of
srcu_read_lock() and srcu_read_lock_nmi().

Or are you somehow avoiding ever invoking either srcu_read_lock() or
srcu_read_unlock() from NMI context?

For example, are we simply living dangerously with NMI-based stack traces
as suggested below?  ;-)

>   - Consoles become stateful to handle handover and takeover in a graceful
>     way.
> 
>   - All console state operations rely solely on atomic*_try_cmpxchg() so
>     they work in any context.
> 
>   - Console locking is per console to allow friendly handover or "safe"
>     hostile takeover in emergency/panic situations. Console lock is not
>     relevant for consoles converted to the new infrastructure.
> 
>   - The core provides interfaces for console drivers to query whether they
>     can proceed and to denote 'unsafe' sections in the console state, which
>     is unavoidable for some console drivers.
> 
>     In fact there is not a single regular (non-early) console driver today
>     which is reentrancy safe under all circumstances, which enforces that
>     NMI context is excluded from printing directly. TBH, that's a sad state
>     of affairs.
> 
>     The unsafe state bit allows to make informed decisions in the core
>     code, e.g. to postpone printing if there are consoles available which
>     are safe to acquire. In case of a hostile takeover the unsafe state bit
>     is handed to the atomic write callback so that the console driver can
>     act accordingly.
>     
>   - Printing is delegated to a per console printer thread except for the
>     following situations:
> 
>        - Early boot
>        - Emergency printing (WARN/OOPS)
>        - Panic printing
> 
> The integration is complete, but without any fancy things, like locking all
> consoles when entering a WARN, print and unlock when done. Such things only
> make sense once all drivers are converted over because that conflicts with
> the way how the existing console lock mechanics work.
> 
> For testing we used the most simple driver: a hacked up version of the
> early uart8250 console as we wanted to concentrate on validating the core
> mechanisms of friendly handover and hostile takeovers instead of dealing
> with the horrors of port locks or whatever at the same time. That's the
> next challenge. Hack patch will be provided in a reply.
> 
> Here is sample output where we let the atomic and thread write functions
> prepend each line with the printing context (A=atomic, T=thread):
> 
> A[    0.394066] ... fixed-purpose events:   3
> A[    0.395130] ... event mask:             000000070000000f
> 
> End of early boot, thread starts
> 
> TA[    0.396821] rcu: Hierarchical SRCU implementation.
> 
> ^^ Thread starts printing and immediately raises a warning, so atomic
>    context at the emergency priority takes over and continues printing.
> 
>    This is a forceful, but safe takeover scenario as the WARN context
>    is obviously on the same CPU as the printing thread where friendly
>    is not an option.
> 
> A[    0.397133] rcu: 	Max phase no-delay instances is 400.
> A[    0.397640] ------------[ cut here ]------------
> A[    0.398072] WARNING: CPU: 0 PID: 13 at drivers/tty/serial/8250/8250_early.c:123 __early_serial8250_write.isra.0+0x80/0xa0
> ....
> A[    0.440131]  ret_from_fork+0x1f/0x30
> A[    0.441133]  </TASK>
> A[    0.441867] ---[ end trace 0000000000000000 ]---
> T[    0.443493] smp: Bringing up secondary CPUs ...
> 
> After the warning the thread continues printing.
> 
> ....
> 
> T[    1.916873] input: ImExPS/2 Generic Explorer Mouse as /devices/platform/i8042/serio1/input/input3
> T[    1.918719] md: Waiting for all devices to be available before autod
> 
> A[    1.918719] md: Waiting for all devices to be available before autodetect
> 
> System panics because it can't find a root file system. Panic printing
> takes over the console from the printer thread immediately and reprints
> the interrupted line.
> 
> This case is a friendly handover from the printing thread to the panic
> context because the printing thread was not running on the panic CPU, but
> handed over gracefully.
> 
> A[    1.919942] md: If you don't use raid, use raid=noautodetect
> A[    1.921030] md: Autodetecting RAID arrays.
> A[    1.921919] md: autorun ...
> A[    1.922686] md: ... autorun DONE.
> A[    1.923761] /dev/root: Can't open blockdev
> 
> So far the implemented state handling machinery holds up on the various
> handover and hostile takeover situations we enforced for testing.
> 
> Hostile takeover is nevertheless a special case. If the driver is in an
> unsafe region that's something which needs to be dealt with per driver.
> There is not much the core code can do here except of trying a friendly
> handover first and only enforcing it after a timeout or not trying to print
> on such consoles.
> 
> This needs some thought, but we explicitely did not implement any takeover
> policy into the core state handling mechanism as this is really a decision
> which needs to be made at the call site. See patch 28.
> 
> We are soliciting feedback on that approach and we hope that we can
> organize a BOF in Dublin on short notice.

Last I checked, there were still some slots open.

> The series is also available from git:
> 
>     git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git printk
> 
> The series has the following parts:
> 
>    Patches  1 - 5:   Cleanups
> 
>    Patches  6 - 12:  Locking and list conversion
> 
>    Patches 13 - 18:  Improved output buffer handling to prepare for
>    	      	     code sharing
> 
>    Patches 19 - 29:  New infrastructure implementation
> 
> Most of the preparatory patches 1-18 have probably a value on their own.
> 
> Don't be scared about the patch stats below. We added kernel doc and
> extensive comments to the code:
> 
>   kernel/printk/printk_nobkl.c: Code: 668 lines, Comments: 697 lines, Ratio: 1:1.043

When I reach stable AC, I will fire off some rcutorture tests.  From the
discussion above, it sounds like I should expect some boot-time warnings?
Or were those strictly printk-specific testing?

> Of course the code is trivial and straight forward as any other facility
> which has to deal with concurrency and the twist of being safe in any
> context. :)

;-) ;-) ;-)

							Thanx, Paul

> Comments welcome.
> 
> Thanks,
> 
> 	tglx
> ---
> [1] https://lore.kernel.org/lkml/87r11qp63n.ffs@tglx/
> ---
>  arch/parisc/include/asm/pdc.h  |    2 
>  arch/parisc/kernel/pdc_cons.c  |   53 -
>  arch/parisc/kernel/traps.c     |   17 
>  b/kernel/printk/printk_nobkl.c | 1564 +++++++++++++++++++++++++++++++++++++++++
>  drivers/tty/serial/kgdboc.c    |    7 
>  drivers/tty/tty_io.c           |    6 
>  fs/proc/consoles.c             |   12 
>  fs/proc/kmsg.c                 |    2 
>  include/linux/console.h        |  375 +++++++++
>  include/linux/printk.h         |    9 
>  include/linux/syslog.h         |    3 
>  kernel/debug/kdb/kdb_io.c      |    7 
>  kernel/panic.c                 |   12 
>  kernel/printk/printk.c         |  485 ++++++++----
>  14 files changed, 2304 insertions(+), 250 deletions(-)