linux-kernel - Re: [PATCH printk v2 09/11] panic: Add atomic write enforcement to oops

Message-ID: <ZRQq2ZqMN34qLs44@alley>
Date:   Wed, 27 Sep 2023 15:15:05 +0200
From:   Petr Mladek <pmladek@...e.com>
To:     John Ogness <john.ogness@...utronix.de>
Cc:     Sergey Senozhatsky <senozhatsky@...omium.org>,
        Steven Rostedt <rostedt@...dmis.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        linux-kernel@...r.kernel.org, Kees Cook <keescook@...omium.org>,
        Luis Chamberlain <mcgrof@...nel.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Josh Poimboeuf <jpoimboe@...nel.org>,
        Arnd Bergmann <arnd@...db.de>,
        "Guilherme G. Piccoli" <gpiccoli@...lia.com>,
        Andy Shevchenko <andriy.shevchenko@...ux.intel.com>
Subject: Re: [PATCH printk v2 09/11] panic: Add atomic write enforcement to
 oops

On Wed 2023-09-20 01:14:54, John Ogness wrote:
> Invoke the atomic write enforcement functions for oops to
> ensure that the information gets out to the consoles.
> 
> Since there is no single general function that calls both
> oops_enter() and oops_exit(), the nesting feature of atomic
> write sections is taken advantage of in order to guarantee
> full coverage between the first oops_enter() and the last
> oops_exit().
> 
> It is important to note that if there are any legacy consoles
> registered, they will be attempting to directly print from the
> printk-caller context, which may jeopardize the reliability of
> the atomic consoles. Optimally there should be no legacy
> consoles registered.
> 
> --- a/kernel/panic.c
> +++ b/kernel/panic.c
> @@ -630,6 +634,36 @@ bool oops_may_print(void)
>   */
>  void oops_enter(void)
>  {
> +	enum nbcon_prio prev_prio;
> +	int cpu = -1;
> +
> +	/*
> +	 * If this turns out to be the first CPU in oops, this is the
> +	 * beginning of the outermost atomic section. Otherwise it is
> +	 * the beginning of an inner atomic section.
> +	 */

This sounds strange. What is the advantage of having the inner
atomic context, please? It covers only messages printed inside
oops_enter() and not the whole oops_enter()/exit(). Also see below.

> +	prev_prio = nbcon_atomic_enter(NBCON_PRIO_EMERGENCY);
> +
> +	if (atomic_try_cmpxchg_relaxed(&oops_cpu, &cpu, smp_processor_id())) {
> +		/*
> +		 * This is the first CPU in oops. Save the outermost
> +		 * @prev_prio in order to restore it on the outermost
> +		 * matching oops_exit(), when @oops_nesting == 0.
> +		 */
> +		oops_prev_prio = prev_prio;
> +
> +		/*
> +		 * Enter an inner atomic section that ends at the end of this
> +		 * function. In this case, the nbcon_atomic_enter() above
> +		 * began the outermost atomic section.
> +		 */
> +		prev_prio = nbcon_atomic_enter(NBCON_PRIO_EMERGENCY);
> +	}
> +
> +	/* Track nesting when this CPU is the owner. */
> +	if (cpu == -1 || cpu == smp_processor_id())
> +		oops_nesting++;
> +
>  	tracing_off();
>  	/* can't trust the integrity of the kernel anymore: */
>  	debug_locks_off();
> @@ -637,6 +671,9 @@ void oops_enter(void)
>  
>  	if (sysctl_oops_all_cpu_backtrace)
>  		trigger_all_cpu_backtrace();
> +
> +	/* Exit inner atomic section. */
> +	nbcon_atomic_exit(NBCON_PRIO_EMERGENCY, prev_prio);

This will not flush the messages when:

   + This CPU owns oops_cpu. The flush will have to wait until
     the outer atomic section is exited.

     In this case, the inner atomic context is not needed.


   + oops_cpu is owned by another CPU, and that CPU is
     currently flushing the messages and blocking the
     per-console lock.

     The good thing is that the messages printed by this oops_enter()
     would likely get flushed by the other CPU.

     The bad thing is that oops_exit() on this CPU won't call
     nbcon_atomic_exit() so that the following OOPS messages
     from this CPU might need to wait for the printk kthread.
     IMHO, this is not what we want.


One solution would be to store prev_prio in per-CPU array
so that each CPU could call its own nbcon_atomic_exit().
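
To make the per-CPU idea concrete, here is a minimal userspace sketch.
It is only a model under stated assumptions: the arrays stand in for
real per-CPU variables, model_atomic_enter()/model_atomic_exit() are
hypothetical stand-ins for nbcon_atomic_enter()/nbcon_atomic_exit(),
and the flush is reduced to a counter. It is not the code from the
patch.

```c
#include <assert.h>

#define MODEL_NR_CPUS 4	/* hypothetical CPU count for this model */

/* Illustrative priority levels; not the kernel's enum nbcon_prio. */
enum model_prio {
	MODEL_PRIO_NONE = 0,
	MODEL_PRIO_NORMAL,
	MODEL_PRIO_EMERGENCY,
};

static enum model_prio cpu_prio[MODEL_NR_CPUS];       /* current priority per CPU */
static enum model_prio oops_prev_prio[MODEL_NR_CPUS]; /* saved by outermost enter */
static int oops_nesting[MODEL_NR_CPUS];               /* oops depth per CPU */
static int flush_count[MODEL_NR_CPUS];                /* flushes done per CPU */

/* Stand-in for nbcon_atomic_enter(): raise the priority, return the old one. */
enum model_prio model_atomic_enter(int cpu, enum model_prio prio)
{
	enum model_prio prev = cpu_prio[cpu];

	if (prio > prev)
		cpu_prio[cpu] = prio;
	return prev;
}

/* Stand-in for nbcon_atomic_exit(): restore the priority, count a flush. */
void model_atomic_exit(int cpu, enum model_prio prev)
{
	cpu_prio[cpu] = prev;
	flush_count[cpu]++;
}

void percpu_oops_enter(int cpu)
{
	enum model_prio prev = model_atomic_enter(cpu, MODEL_PRIO_EMERGENCY);

	/* Only the outermost oops_enter() on this CPU records @prev. */
	if (oops_nesting[cpu]++ == 0)
		oops_prev_prio[cpu] = prev;
}

void percpu_oops_exit(int cpu)
{
	assert(oops_nesting[cpu] > 0);

	/* The outermost oops_exit() on this CPU restores and flushes. */
	if (--oops_nesting[cpu] == 0)
		model_atomic_exit(cpu, oops_prev_prio[cpu]);
}
```

With this layout no CPU depends on being the oops_cpu owner: each CPU
that enters an oops also leaves its own atomic section and triggers
its own flush.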

But I am starting to like more and more the idea of storing
and counting nested emergency contexts in struct task_struct.
It is the alternative implementation in the reply to the 7th patch,
https://lore.kernel.org/r/ZRLBxsXPCym2NC5Q@alley

Then it will be enough to simply call:

   + nbcon_emergency_enter() in oops_enter()
   + nbcon_emergency_exit() in oops_exit()
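
A rough userspace sketch of how the counting could look. Everything
here is an assumption made for illustration: struct task_model and
its fields are hypothetical stand-ins for whatever would be added to
struct task_struct, the flush is reduced to a counter, and the
model_/task_ functions are not the real nbcon_emergency_enter() and
nbcon_emergency_exit(). The actual proposal is in the reply linked
above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for state kept in struct task_struct. */
struct task_model {
	unsigned int emergency_nesting;	/* depth of nested emergency contexts */
	unsigned int flushes;		/* model only: number of flushes done */
};

/* Modeled enter: just bump the nesting counter. */
void model_emergency_enter(struct task_model *t)
{
	t->emergency_nesting++;
}

/* Modeled exit: flush once the outermost context is left. */
void model_emergency_exit(struct task_model *t)
{
	assert(t->emergency_nesting > 0);

	if (--t->emergency_nesting == 0)
		t->flushes++;
}

/* True while the task is inside at least one emergency context. */
bool model_in_emergency(const struct task_model *t)
{
	return t->emergency_nesting > 0;
}

/* oops_enter()/oops_exit() then reduce to one call each. */
void task_oops_enter(struct task_model *t)
{
	model_emergency_enter(t);
}

void task_oops_exit(struct task_model *t)
{
	model_emergency_exit(t);
}
```

The point of the counter is that nesting needs no saved prev_prio and
no special handling of the oops_cpu owner: the outermost exit is
simply the one that brings the counter back to zero.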

Best Regards,
Petr

PS: I just hope that you didn't add all this complexity just because
    we preferred this behavior at LPC 2022. I especially hope
    that it was not me who proposed and preferred this.